@manasRK
Created February 15, 2018 06:12
Hack for Handling pad_packed_sequence in PyTorch

This hack is necessitated by the open issue pytorch/pytorch#1591.

First, the usual pack_padded_sequence and pad_packed_sequence calls for handling variable-length sequences:

seq_len, bsz, n_dims = feats.size()
packed_input = pack_padded_sequence(feats, lengths, batch_first=False)
packed_output, self.hidden = self.lstm(packed_input, self.hidden)
# lstm_out --> max(lengths) x bsz x hidden_dim (the time dimension can be shorter than seq_len)
lstm_out, output_lengths = pad_packed_sequence(packed_output, batch_first=False)
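
These snippets assume the following imports; note also that pack_padded_sequence expects lengths sorted in descending order, with feats reordered to match:

import torch
from torch import autograd
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# lengths should be sorted in descending order (and feats reordered to match) before packing.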

The hack is needed because the size of the Variable returned by pad_packed_sequence is determined by the maximum length in output_lengths, not by the seq_len of the batch. You may also have to hardcode MAXLEN in your sequence/loss masking procedures. The fix is to pad lstm_out back up to seq_len with zeros:

if lstm_out.size(0) < seq_len:
    # Pad the time dimension back to seq_len with zeros so downstream code
    # always sees a (seq_len x bsz x hidden_dim) output.
    dummy_tensor = autograd.Variable(torch.zeros(seq_len - lstm_out.size(0), bsz, self.hidden_dim))
    lstm_out = torch.cat([lstm_out, dummy_tensor], 0)
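
For the MAXLEN remark above, here is a minimal sketch of one way to build and apply such a mask; MAXLEN, per_step_loss and the masking scheme are illustrative placeholders, not something prescribed by this gist:

# Build a (MAXLEN x bsz) mask with ones over real time steps and zeros over padding.
MAXLEN = seq_len                      # the hardcoded / known maximum sequence length
mask = torch.zeros(MAXLEN, bsz)
for i, length in enumerate(lengths):
    mask[:length, i] = 1.0
mask = autograd.Variable(mask)

# per_step_loss is assumed to be an unreduced, per-timestep loss of shape MAXLEN x bsz.
masked_loss = (per_step_loss * mask).sum() / mask.sum()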

Our accuracy metrics have remained stable and predictions have stayed in line with our expectations, so I think this hack works well.
