graph LR
A[Multitask: Pretrain Task 1, Pretrain Task 2] --> B((Target Task 1))
A --> C((Target Task 2))
B --> D{Evaluate}
C --> D
First, set up your environment by creating the conda environment from the provided environment file:
conda env create -f environment.yml
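Then activate it before running anything (the command below assumes the name field in environment.yml is jiant; check the file in your checkout):
conda activate jiant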
This project uses git submodules to manage some dependencies on other research code, in particular for loading CoVe and the OpenAI transformer model. To make sure you get these repos when you download jiant, add --recursive to your clone command:
git clone --recursive git@github.com:jsalt18-sentence-repl/jiant.git jiant
If you already cloned and just need to get the submodules, you can do:
git submodule update --init --recursive
Now, let's get started! Say we want to pretrain on SST and MRPC (multitask training) and then do target-task training on STS-B and WNLI using a BiLSTM. We'll first make a configuration file. config/defaults.conf already defines most options; the most important parts are:
- sent_enc = rnn
- bidirectional = 1
- pretrain_tasks = "sst,mrpc"
- target_tasks = "sts-b,wnli"
- transfer_paradigm = "finetune". With "finetune", the shared sentence encoder is also updated during target-task training; with "frozen", only the task-specific parameters are trained. A minimal config pulling these options together is sketched after this list.
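Putting these options together, a config for this experiment might look roughly like the sketch below. The file name my_exp.conf and the exp_name/run_name values are made up for illustration, and the include line assumes the same relative-include pattern used by demo.conf:

// my_exp.conf -- a sketch for this tutorial, not an official example
include "defaults.conf"   // inherit every option we don't override below

exp_name = "bilstm-multitask-demo"   // illustrative experiment/run names
run_name = "sst-mrpc-to-stsb-wnli"

sent_enc = rnn
bidirectional = 1
pretrain_tasks = "sst,mrpc"
target_tasks = "sts-b,wnli"
transfer_paradigm = "finetune"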
If you want to add task-specific parameters (like hidden dimensions, dropout, etc.), you can do that like so:
sts-b += {
    classifier_hid_dim = 512
    classifier_dropout = 0.3
    pair_attn = 0
    max_vals = 16
    val_interval = 100
}
You'll probably want to use a .sh file to launch the code, and there you can override anything else from your config file, such as the project directory, exp_name, and run_name. A full example config is at: https://github.com/nyu-mll/jiant/blob/master/config/demo.conf
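For example, a launch script might look like the sketch below. The main.py entry point and the --config_file/--overrides flags follow jiant's main script, while the exported paths and names are placeholders you should adapt to your setup:

#!/bin/bash
# run_exp.sh -- a sketch; adjust paths and names for your machine
export JIANT_PROJECT_PREFIX=/path/to/experiments   # where runs and logs are written (placeholder path)
export JIANT_DATA_DIR=/path/to/glue_data           # where the task data lives (placeholder path)

python main.py \
    --config_file config/my_exp.conf \
    --overrides "exp_name = bilstm-multitask-demo, run_name = run1"

Anything passed in --overrides takes precedence over the values in the config file.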
What do the logs mean?
And there you have it! Your first experiment.