In Django, I can't understand why a QuerySet's iterator() method reduces memory usage.

The Django documentation says the following:

A QuerySet typically caches its results internally so that repeated evaluations do not result in additional queries. In contrast, iterator() will read results directly, without doing any caching at the QuerySet level (internally, the default iterator calls iterator() and caches the return value). For a QuerySet which returns a large number of objects that you only need to access once, this can result in better performance and a significant reduction in memory.

As far as I know, however, whether or not iterator() is used, once the queryset is evaluated the queried rows are fetched from the database all at once and loaded into memory. Doesn't that mean memory proportional to the number of rows is used whether or not the queryset does any caching? Then what is the benefit of using iterator(), assuming the queryset is evaluated only once?

Is it because raw data fetched from the database and data that is cached (after instantiating) are stored in separate memory spaces? If so, I think I can understand that not performing caching using the iterator() method saves memory.

1 Answer

When you use `iterator()`, Django reads the results through a database cursor as you iterate, instead of building the full list of objects up front. If you use something like `all()` and iterate over that in Python, every record is cached in memory even though you only need them one at a time.

So with `iterator()` you consume rows from the cursor as you go, and the remaining records are not all held in memory at once.
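The difference can be sketched without Django, using only the standard-library `sqlite3` module. This is a rough illustration under stated assumptions, not Django's actual implementation: `CachingResult`, its `_result_cache` attribute, and the `item` table are hypothetical names made up for the example. It shows that streaming still issues a single query; the rows are pulled from the cursor in chunks with `fetchmany()`, and nothing is retained on the object afterwards.

```python
import sqlite3


class CachingResult:
    """Hypothetical stand-in for a QuerySet (not Django code)."""

    def __init__(self, conn, sql):
        self.conn = conn
        self.sql = sql
        self._result_cache = None

    def __iter__(self):
        if self._result_cache is None:
            # One query; every row is stored so later iterations can reuse it.
            self._result_cache = self.conn.execute(self.sql).fetchall()
        return iter(self._result_cache)

    def iterator(self, chunk_size=2000):
        # Still one query, but rows are pulled from the cursor in chunks
        # and nothing is stored on self.
        cursor = self.conn.execute(self.sql)
        while True:
            rows = cursor.fetchmany(chunk_size)
            if not rows:
                break
            yield from rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{i}",) for i in range(10_000)],
)

statements = []
conn.set_trace_callback(statements.append)  # record each SQL statement that runs

streamed = CachingResult(conn, "SELECT id, name FROM item")
streamed_count = sum(1 for _ in streamed.iterator(chunk_size=500))

cached = CachingResult(conn, "SELECT id, name FROM item")
cached_count = sum(1 for _ in cached)

# One SELECT per evaluation, regardless of how many rows or chunks were read.
select_count = sum(1 for s in statements if s.startswith("SELECT"))
print(select_count, streamed._result_cache is None, len(cached._result_cache))
```

In Django itself, `iterator()` accepts a `chunk_size` argument (2000 by default), and the documentation notes that PostgreSQL and Oracle use server-side cursors to stream results, while other backends may still load the whole raw result set in the driver. Either way the number of queries stays at one; what `iterator()` avoids is keeping every model instance alive in the QuerySet's cache.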

Amin
  • So to fetch `n` rows from the database, does the `iterator()` method hit the database `n` times? – Choi Deok Gyeong Feb 19 '23 at 15:53
  • Yes. Read https://stackoverflow.com/questions/4222176/why-is-iterating-through-a-large-django-queryset-consuming-massive-amounts-of-me @ChoiDeokGyeong – Amin Feb 19 '23 at 16:15
  • But when I checked the query count with `from django.db import connection` and `print(len(connection.queries))`, iterating with `iterator()` over a queryset of n rows still prints `1`. Why not `n`? – Choi Deok Gyeong Feb 20 '23 at 08:29