István Ketykó ketyi

NLP Reading

Deep Learning for NLP: Book by Yoav Goldberg, and a Primer version (without the NLP bits, without some of the advanced bits)
Manning and Schutze Foundations of Statistical Natural Language Processing. Buy at Amazon
- Classic book, a bit outdated by now, but some chapters are still worth reading today.
Jurafsky and Martin Speech and Language Processing (3rd Edition)

	# train_grpo.py
	#
	# See https://github.com/willccbb/verifiers for ongoing developments
	#
	"""
	citation:

	@misc{brown2025grpodemo,
	title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
	author={Brown, William},

	"""
	PyTorch has pack_padded_sequence, but this doesn’t work with dense layers. For sequence data with high variance in length,
	the best way to minimize padding and masking within a batch is to feed in data that is already grouped by sequence length
	(while still shuffling it somewhat). Here is my current solution in numpy.
	I will need to convert every function over to torch to allow it to run on the GPU, and I am sure there are many other
	ways to optimize it further. I hope this helps others and that it can maybe become a new PyTorch Batch Sampler someday.

	General approach to how it works:

	Decide what your bucket boundaries for the data are.
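
The bucketing idea described above can be sketched in a few lines of numpy. This is only an illustration under stated assumptions, not the gist's own code: the function name `bucket_batches` and its parameters (`lengths`, `boundaries`, `batch_size`, `seed`) are hypothetical. It assigns each sequence to a length bucket, shuffles within each bucket, cuts the buckets into batches, and then shuffles the order of the batches.

```python
import numpy as np


def bucket_batches(lengths, boundaries, batch_size, seed=0):
    """Return a list of index arrays; each batch holds sequences from one length bucket.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    lengths = np.asarray(lengths)
    # np.digitize maps each length to the index of the bucket it falls into.
    bucket_ids = np.digitize(lengths, boundaries)
    batches = []
    for b in np.unique(bucket_ids):
        idx = np.flatnonzero(bucket_ids == b)
        rng.shuffle(idx)  # shuffle within the bucket
        for start in range(0, len(idx), batch_size):
            batches.append(idx[start:start + batch_size])
    # Shuffle the batch order so buckets are not visited in length order.
    order = rng.permutation(len(batches))
    return [batches[i] for i in order]
```

Because every batch comes from a single bucket, the padding needed inside a batch is bounded by the width of that bucket. The last batch of each bucket may be smaller than `batch_size`.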

	#!/usr/bin/env python3
	# -*- coding: utf-8 -*-


	"""
	Most of this code is borrowed from niffler92's project.
	https://github.com/niffler92/SNGAN
	"""

	from keras.callbacks import Callback
	import keras.backend as K
	import numpy as np

	class SGDRScheduler(Callback):
	    '''Cosine annealing learning rate scheduler with periodic restarts.

	    # Usage
	        ```python
	        schedule = SGDRScheduler(min_lr=1e-5,
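
For readers unfamiliar with the schedule named in the docstring, here is a minimal framework-free sketch of cosine annealing with restarts. It is not the truncated `SGDRScheduler` callback: the function name `sgdr_lr` and the parameters `max_lr`, `cycle_length`, and `mult_factor` are assumptions made for illustration (only `min_lr` appears in the preview above).

```python
import math


def sgdr_lr(step, min_lr, max_lr, cycle_length, mult_factor=1.0):
    """Learning rate at `step` under cosine annealing with periodic restarts.

    Hypothetical helper; steps are counted from 0.
    """
    t, length = step, cycle_length
    # Skip over completed cycles; each new cycle may be longer than the last.
    while t >= length:
        t -= length
        length = max(1, int(round(length * mult_factor)))
    fraction = t / length  # position inside the current cycle, in [0, 1)
    # Cosine decay from max_lr (start of cycle) toward min_lr (end of cycle).
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * fraction))
```

At the start of each cycle the rate jumps back to `max_lr`, which is the "restart"; with `mult_factor` greater than 1, successive cycles get longer.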

	import tensorflow as tf
	from keras.backend.tensorflow_backend import set_session
	config = tf.ConfigProto()
	config.gpu_options.per_process_gpu_memory_fraction = 0.9
	config.gpu_options.visible_device_list = "0"
	set_session(tf.Session(config=config))

	from contextlib import contextmanager
	import numpy as np
	import torch
	from torch import Tensor, ByteTensor
	import torch.nn.functional as F
	from torch.autograd import Variable
	import pycuda.driver
	from pycuda.gl import graphics_map_flags
	from glumpy import app, gloo, gl

	#!/usr/bin/env python
	# Implementation of algorithm from http://stackoverflow.com/a/22640362/6029703
	import numpy as np
	import pylab

	def thresholding_algo(y, lag, threshold, influence):
	    signals = np.zeros(len(y))
	    filteredY = np.array(y)
	    avgFilter = [0]*len(y)
	    stdFilter = [0]*len(y)
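
The preview above stops after the initialisation. As a rough guide to how a smoothed z-score peak detector of this kind usually proceeds, here is a self-contained sketch. It is an assumption about the general technique, not the remainder of the gist: the function name `zscore_peaks` and the tuple it returns are hypothetical, while the `lag`, `threshold`, and `influence` parameters mirror the signature shown above.

```python
import numpy as np


def zscore_peaks(y, lag, threshold, influence):
    """Flag points that deviate from a rolling mean by more than threshold * rolling std.

    Hypothetical sketch; returns (signals, rolling_mean, rolling_std).
    """
    y = np.asarray(y, dtype=float)
    signals = np.zeros(len(y), dtype=int)
    filtered = y.copy()
    avg = np.zeros(len(y))
    std = np.zeros(len(y))
    # Seed the rolling statistics with the first `lag` observations.
    avg[lag - 1] = np.mean(y[:lag])
    std[lag - 1] = np.std(y[:lag])
    for i in range(lag, len(y)):
        if abs(y[i] - avg[i - 1]) > threshold * std[i - 1]:
            signals[i] = 1 if y[i] > avg[i - 1] else -1
            # Damp the effect of a flagged point on the running statistics.
            filtered[i] = influence * y[i] + (1 - influence) * filtered[i - 1]
        else:
            filtered[i] = y[i]
        window = filtered[i - lag + 1:i + 1]
        avg[i] = np.mean(window)
        std[i] = np.std(window)
    return signals, avg, std
```

With `influence=0` a flagged point does not move the rolling mean at all, so a single spike cannot raise the baseline and mask later peaks; with `influence=1` flagged points are treated like any other sample.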