
I came across some strange behaviour recently, and need to check my understanding.

I'm using a simple filter in the model and then iterating over the results.

e.g.

allbooks = Book.objects.filter(author='A.A. Milne')

for book in allbooks:
    do_something(book)

Oddly, it was returning only a partial list of books.

However, the same code with iterator() seems to work well and returns the full list.

i.e.

for book in allbooks.iterator():
    do_something(book)

Any idea why?

p.s. I did look through the Django documentation, but can't see how the queryset would be cached already anywhere else...

iterator() Evaluates the QuerySet (by performing the query) and returns an iterator over the results. A QuerySet typically caches its results internally so that repeated evaluations do not result in additional queries; iterator() will instead read results directly, without doing any caching at the QuerySet level. For a QuerySet which returns a large number of objects, this often results in better performance and a significant reduction in memory.

Note that using iterator() on a QuerySet which has already been evaluated will force it to evaluate again, repeating the query.

funnydman
gingerlime
  • Another odd thing I just noticed: without iterator(), the loop returns only 100 objects. That seems to match the first 100 objects being cached automatically by '__iter__', but I still wonder why the loop simply ends there instead of carrying on over all objects matching the filter. – gingerlime Feb 20 '11 at 20:37

2 Answers


oddly, it was returning only a partial list of books.

That's not how a queryset is supposed to work. Iterating over a queryset should give you every record returned by your database. Debug your code: you'll find the error, and if you don't, debug it again.

It's easy to check in the REPL. Run manage.py shell:

from app.models import Model

# Print every object matching the filter
for o in Model.objects.filter(fieldname="foo"):
    print(o)

# Inspect the SQL queries that were run (only recorded when DEBUG=True)
from django.db import connection
print(connection.queries)
guettli
alex vasi
  • Thanks. Exactly the same code against the same database returns 100 records without .iterator() and 129 with it. Also, if I call len(allbooks) first, then for book in allbooks: returns 129. – gingerlime Feb 21 '11 at 05:36
  • You were right. Something inside the do_something(book) was iterating over the same allbooks... doh! – gingerlime Feb 21 '11 at 06:09
  • Amazing! I'd been trying to figure out how to access the values inside a queryset without success till I stumbled on this answer from 10 years back! Thanks! – YCode Apr 07 '21 at 03:12

A QuerySet typically caches its results internally so that repeated evaluations do not result in additional queries. In contrast, iterator() will read results directly, without doing any caching at the QuerySet level.

https://docs.djangoproject.com/en/dev/ref/models/querysets/
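
To make the caching difference concrete, here is a minimal pure-Python sketch. It is a toy model, not Django's implementation: the class ToyQuerySet, its query_count counter and the fake rows are all made up for illustration. It only mirrors the documented behaviour that plain iteration fills and reuses a cache, while iterator() goes back to the data source every time.

```python
class ToyQuerySet:
    """Toy stand-in for a lazily evaluated, caching result set (not Django code)."""

    def __init__(self, rows):
        self._rows = list(rows)      # pretend these rows live in the database
        self._result_cache = None    # filled on the first full evaluation
        self.query_count = 0         # how many times the "database" was hit

    def _run_query(self):
        # Generator: the counter is bumped when iteration actually starts.
        self.query_count += 1
        for row in self._rows:
            yield row

    def __iter__(self):
        # The first loop runs the query and fills the cache; later loops reuse it.
        if self._result_cache is None:
            self._result_cache = list(self._run_query())
        return iter(self._result_cache)

    def iterator(self):
        # Bypass the cache: every call runs the query again.
        return self._run_query()


books = ToyQuerySet(f"book {i}" for i in range(129))

first = list(books)                  # runs the query, fills the cache
second = list(books)                 # served from the cache, no new query
queries_after_loops = books.query_count

streamed = list(books.iterator())    # runs the query again, cache untouched
queries_after_iterator = books.query_count
```

In this sketch both ways of iterating yield all 129 rows; the only difference is how often the data source is hit. That matches the accepted answer: if a plain loop comes up short, the cause is more likely in the surrounding code than in the caching itself.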

LuFFy