@jamescalam
Created June 11, 2021 19:09
{
"metadata": {
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
},
"orig_nbformat": 2,
"kernelspec": {
"name": "ml",
"display_name": "ML",
"language": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2,
"cells": [
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"encodings = {'input_ids': input_ids, 'attention_mask': mask, 'labels': labels}"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"class Dataset(torch.utils.data.Dataset):\n",
"    def __init__(self, encodings):\n",
"        # store encodings internally\n",
"        self.encodings = encodings\n",
"\n",
"    def __len__(self):\n",
"        # return the number of samples\n",
"        return self.encodings['input_ids'].shape[0]\n",
"\n",
"    def __getitem__(self, i):\n",
"        # return dictionary of input_ids, attention_mask, and labels for index i\n",
"        return {key: tensor[i] for key, tensor in self.encodings.items()}"
]
},
{
"source": [
"Next, we initialize our `Dataset` with the `encodings` dictionary."
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"dataset = Dataset(encodings)"
]
},
{
"source": [
"Then we initialize the `DataLoader`, which feeds the data to the model in shuffled batches of 16 during training."
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)"
]
}
]
}
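The notebook above relies on `input_ids`, `mask`, and `labels` tensors built in earlier cells that are not part of this gist. As a minimal sketch of how the pieces fit together, the snippet below substitutes small random tensors (the sizes `num_samples = 40` and `seq_len = 8` are assumptions chosen only for illustration) and iterates over the loader once to show what each batch looks like.

```python
import torch

# Hypothetical stand-in data: the real tensors come from earlier notebook cells.
num_samples, seq_len = 40, 8
input_ids = torch.randint(0, 1000, (num_samples, seq_len))
mask = torch.ones(num_samples, seq_len, dtype=torch.long)
labels = input_ids.clone()

encodings = {'input_ids': input_ids, 'attention_mask': mask, 'labels': labels}


class Dataset(torch.utils.data.Dataset):
    def __init__(self, encodings):
        # store encodings internally
        self.encodings = encodings

    def __len__(self):
        # number of samples is the first dimension of input_ids
        return self.encodings['input_ids'].shape[0]

    def __getitem__(self, i):
        # one sample: a dict holding row i of every tensor
        return {key: tensor[i] for key, tensor in self.encodings.items()}


dataset = Dataset(encodings)
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

# The default collate function stacks the per-sample dicts into one dict of batched tensors.
first_batch = next(iter(loader))

# Collect the size of every batch in one pass over the data.
batch_sizes = [batch['input_ids'].shape[0] for batch in loader]
```

With 40 samples and `batch_size=16`, one pass yields two full batches and a final batch of 8, because `drop_last` defaults to `False`. Each batch is a dictionary with the same keys as `encodings`, so it can typically be unpacked straight into a model call that accepts `input_ids`, `attention_mask`, and `labels` keyword arguments.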