import torch

# SelfAttention_v1 (nn.Parameter weights) and SelfAttention_v2 (nn.Linear
# layers) plus d_in and d_out are assumed to be defined already.
sa_v1 = SelfAttention_v1(d_in, d_out)
sa_v2 = SelfAttention_v2(d_in, d_out)

# Transfer weights from sa_v2 to sa_v1; nn.Linear stores its weight as
# (d_out, d_in), so transpose to match sa_v1's (d_in, d_out) parameters.
with torch.no_grad():
    sa_v1.W_query.copy_(sa_v2.W_query.weight.T)
    sa_v1.W_key.copy_(sa_v2.W_key.weight.T)
    sa_v1.W_value.copy_(sa_v2.W_value.weight.T)

x = torch.randn(10, d_in)  # batch of 10 input vectors
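As a quick sanity check (a minimal sketch, assuming the two classes and x above), the two modules should now produce identical outputs, since they apply the same weight matrices:

# If the transposed copy above is correct, both versions agree.
out_v1 = sa_v1(x)
out_v2 = sa_v2(x)
print(torch.allclose(out_v1, out_v2))  # expected: True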
# <type>(<scope>): <subject>
#
# <body>
#
# <footer>
#
# Types:
#   feat  (new feature)
#   fix   (bug fix)
#   docs  (changes to documentation)
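For illustration, a hypothetical commit message filled in against this template (the scope and subject below are made up) might read:

feat(attention): transfer weights between attention variants

Copy transposed nn.Linear weights from SelfAttention_v2 into
SelfAttention_v1 so both implementations produce identical outputs.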
import numpy as np

def getPositionalEncoding(seq_len, d=4, n=10000):
    # Sinusoidal positional encoding: even columns get sin, odd columns
    # get cos, with frequencies decaying geometrically as n^(2i/d) grows.
    PE = np.zeros((seq_len, d))
    for i in range(d // 2):
        denominator = np.power(n, 2 * i / d)
        PE[:, 2 * i] = np.sin(np.arange(seq_len) / denominator)
        PE[:, 2 * i + 1] = np.cos(np.arange(seq_len) / denominator)
    return PE

seq_len = 2
PE = getPositionalEncoding(seq_len=seq_len)
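Printing the result for seq_len=2 with the defaults d=4, n=10000 shows the expected pattern: position 0 encodes to alternating (sin 0, cos 0) = (0, 1) pairs, while position 1 gives sin/cos of 1 in the fast columns and of 1/100 in the slow ones:

print(PE)
# [[0.         1.         0.         1.        ]
#  [0.84147098 0.54030231 0.00999983 0.99995   ]]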