@rrei
Created July 29, 2020 08:40
Break a large Django queryset into an equivalent set of smaller querysets (caveats apply)
def chunked_queryset(queryset, chunk_size=10000):
    """Slice a queryset into chunks. This is useful to avoid memory issues when
    iterating through large querysets.

    Code adapted from https://djangosnippets.org/snippets/10599/
    """
    if not queryset.exists():
        return
    # Note: ordering by pk replaces any ordering already set on the queryset.
    queryset = queryset.order_by("pk")
    pks = queryset.values_list("pk", flat=True)
    start_pk = pks[0]
    while True:
        try:
            # The pk found chunk_size rows after start_pk starts the next chunk.
            end_pk = pks.filter(pk__gte=start_pk)[chunk_size]
        except IndexError:
            # Fewer than chunk_size + 1 rows remain: only the last chunk is left.
            break
        yield queryset.filter(pk__gte=start_pk, pk__lt=end_pk)
        start_pk = end_pk
    # Last chunk: everything from start_pk onwards.
    yield queryset.filter(pk__gte=start_pk)
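
To see which pk ranges the generator above ends up filtering on, the boundary logic can be modelled without Django. This is only a sketch: `chunk_bounds` is a hypothetical helper that is not part of the gist, and it assumes the primary keys are already available as a sorted Python list, standing in for the ordered `values_list("pk", flat=True)` queryset.

```python
def chunk_bounds(sorted_pks, chunk_size=10000):
    """Yield (start_pk, end_pk) pairs describing half-open pk ranges.

    Hypothetical stand-in for chunked_queryset: each pair corresponds to
    filter(pk__gte=start_pk, pk__lt=end_pk). An end_pk of None marks the
    last chunk, which corresponds to filter(pk__gte=start_pk) only.
    """
    if not sorted_pks:
        return
    start = 0  # index of the first pk in the current chunk
    while start + chunk_size < len(sorted_pks):
        # The pk sitting chunk_size positions ahead opens the next chunk.
        yield sorted_pks[start], sorted_pks[start + chunk_size]
        start += chunk_size
    yield sorted_pks[start], None


# Gaps between pks do not matter; chunks are counted in rows, not pk values.
pks = [1, 2, 5, 8, 9, 13, 21]
bounds = list(chunk_bounds(pks, chunk_size=3))
print(bounds)
```

With the real function, each yielded value is a lazy queryset, so a caller would typically loop over the chunks and then over the objects in each chunk. The boundaries are computed one chunk at a time, so rows inserted or deleted while iterating may shift later chunks.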