5

I am aware that regular queryset or the iterator queryset methods evaluates and returns the entire data-set in one shot .

for instance, take this :

my_objects = MyObject.objects.all()
for rows in my_objects:            # Way 1
for rows in my_objects.iterator(): # Way 2

Question

In both methods all the rows are fetched in a single-go.Is there any way in djago that the queryset rows can be fetched one by one from database.

Why this weird Requirement

At present my query fetches lets says n rows but sometime i get Python and Django OperationalError (2006, 'MySQL server has gone away').

so to have a workaround for this, i am currently using a weird while looping logic.So was wondering if there is any native or inbuilt method or is my question even logical in first place!! :)

Community
  • 1
  • 1
NoobEditor
  • 15,563
  • 19
  • 81
  • 112

3 Answers3

3

I think you are looking to limit your query set.

Quote from above link:

Use a subset of Python’s array-slicing syntax to limit your QuerySet to a certain number of results. This is the equivalent of SQL’s LIMIT and OFFSET clauses.

In other words, If you start with a count you can then loop over and take slices as you require them..

cnt = MyObject.objects.count()
start_point = 0
inc = 5
while start_point + inc < cnt:
    filtered = MyObject.objects.all()[start_point:inc]
    start_point += inc

Of course you may need to error handle this more..

Sayse
  • 42,633
  • 14
  • 77
  • 146
  • I am assuming you have looked into fixing the actual issue here, that would of course be preferred :) – Sayse Sep 10 '15 at 07:05
  • this is definitely one way i have considered while digging around, looks neat.I am wondering if there is any other *native* way in django itself without limit? As for Actual issue, then, it is related to django version, but since whole project is based on it, i can not do anything about it..... – NoobEditor Sep 10 '15 at 07:14
  • @NoobEditor - I think this is about as native as it gets since this will modify the query you perform. You will still perform multiple queries on the db but only returning N results at once. The other option is to look at what you actually need to return and using `values` to just return those fields. (Also note whilst I'm only fetching 5 at a time in my example I'd imagine you can handle a lot more than this at once) – Sayse Sep 10 '15 at 07:17
  • i agree...based on the requirement, this is closest i get.Plus it has flexibility as well depending on querysize and count to retrieve!! – NoobEditor Sep 10 '15 at 07:24
  • @NoobEditor - Indeed :) This example is obviously not tested so you may need to play around with the looping logic if you decide on this approach (I think it might miss out the last few rows atm). – Sayse Sep 10 '15 at 07:29
2

Fetching row by row might be worse. You might want to retrieve in batches for 1000s etc. I have used this Django snippet (not my work) successfully with very large querysets. It doesn't eat up memory and no trouble with connections going away.

Here's the snippet from that link:

import gc

def queryset_iterator(queryset, chunksize=1000):
    '''''
    Iterate over a Django Queryset ordered by the primary key

    This method loads a maximum of chunksize (default: 1000) rows in it's
    memory at the same time while django normally would load all rows in it's
    memory. Using the iterator() method only causes it to not preload all the
    classes.

    Note that the implementation of the iterator does not support ordered query sets.
    '''
    pk = 0
    last_pk = queryset.order_by('-pk')[0].pk
    queryset = queryset.order_by('pk')
    while pk < last_pk:
        for row in queryset.filter(pk__gt=pk)[:chunksize]:
            pk = row.pk
            yield row
        gc.collect()
Pang
  • 9,564
  • 146
  • 81
  • 122
e4c5
  • 52,766
  • 11
  • 101
  • 134
1

To solve (2006, 'MySQL server has gone away') problem, your approach is not that logical. If you will hit database for each entry, it is going to increase number of queries which itself will create problem in future as usage of your application grows. I think you should close mysql connection after iterating all elements of result, and then if you will try to make another query, django will create a new connection.

from django.db import connection:
connection.close()

Refer this for more details

Amit Jaiswal
  • 985
  • 1
  • 9
  • 16
  • issue is that `connection` is getting dumped while fetching the queryset...hence the error.So even if i close and start and new connection, i face same issue again.I tried `connections.connection` & `is_usable` but no help! :\ – NoobEditor Sep 10 '15 at 07:23