How do querysets work when getting multiple random objects from Django?

Question

I need to get multiple random objects from a Django model.

I know I can get one random object from the model Person by typing:

person = Person.objects.order_by('?')[0]

Then I saw suggestions in "How to get two random records with Django" saying I could simply do this with:

people = Person.objects.order_by('?')[0:n]

However, as soon as I add that [0:n], instead of returning the objects, Django returns a QuerySet object. This has the unfortunate consequence that if I then ask for

print(people[0].first_name, people[0].last_name)

I get the first_name and last_name of two different people, since QuerySets are evaluated each time they are accessed (right?). How do I get the actual list of people that were returned from the first query?

I am using Python 3.4.0 and Django 1.7.1

score 0 · Answer 1 · answered Nov 30 '14 at 00:15

Try this ...

people = []
for person in Person.objects.order_by('?')[0:n]:
    people.append(person)

simopopov

It works, thanks. I also found that simply saying people = list(Person.objects.order_by('?')[0:n]) works as well. – Gunnar Nov 30 '14 at 00:40
Your solution is better ... but it's first in my mind :D – simopopov Nov 30 '14 at 00:41

score 0 · Accepted Answer · answered Nov 30 '14 at 17:07

Simeon Popov's answer solves the problem, but let me explain where it comes from.

As you probably know, querysets are lazy and won't be evaluated until it's necessary. They also have an internal cache that gets filled once the entire queryset is evaluated. If you take only a single object from an unevaluated queryset (or a slice with a step specified, e.g. [0:n:2]), Django runs a query, but the result is not stored in that queryset's cache.

Take these two examples:

Example 1

>>> people = Person.objects.order_by('?')[0:n]
>>> print(people[0].first_name, people[0].last_name)
# first and last name of different people

Example 2

>>> people = Person.objects.order_by('?')[0:n]
>>> for person in people:
...     print(person.first_name, person.last_name)
# first and last name are properly matched

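In example 1, the queryset is not yet evaluated when you access the first item. The result won't get cached, so when you access the first item again it runs another query on the database. Because of order_by('?'), each query returns a new random ordering, so the two lookups can return two different people.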

In the second example, the entire queryset is evaluated when you loop over it. Thus, the cache is filled and there won't be any additional database queries that would change the order of the returned items. In that case the names are properly aligned to each other.

Ways to evaluate an entire queryset include iteration, list(), bool() and len(). There are some subtle differences between these methods. If all you want to do is make sure the queryset is cached, I'd suggest using bool(), e.g.:

>>> people = Person.objects.order_by('?')[0:n]
>>> bool(people)
True
>>> print(people[0].first_name, people[0].last_name)
# matching names
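
To make the caching behaviour concrete without a database, here is a minimal pure-Python sketch. ToyQuerySet, its query_count attribute and the sample names are made up for illustration. This is not Django's implementation, only a simplified model of the behaviour described in this answer: indexing before evaluation runs a fresh random query each time, while a full evaluation fills a cache that later lookups reuse.

```python
import random


class ToyQuerySet:
    """Hypothetical stand-in for Person.objects.order_by('?')[0:n].

    Not Django code: it only imitates lazy evaluation plus a result cache.
    """

    def __init__(self, rows, n, seed=None):
        self._rows = list(rows)
        self._n = n
        self._rng = random.Random(seed)
        self._result_cache = None  # filled only by a full evaluation
        self.query_count = 0       # how often the pretend database was hit

    def _run_query(self):
        # Each call stands for one "ORDER BY random, LIMIT n" query.
        self.query_count += 1
        return self._rng.sample(self._rows, self._n)

    def _fetch_all(self):
        # Full evaluation: run the query once and keep the rows.
        if self._result_cache is None:
            self._result_cache = self._run_query()

    def __getitem__(self, index):
        if self._result_cache is not None:
            return self._result_cache[index]
        # Not evaluated yet: run a fresh query and do not cache the result.
        return self._run_query()[index]

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __bool__(self):
        self._fetch_all()
        return bool(self._result_cache)


names = [("Ada", "Lovelace"), ("Alan", "Turing"), ("Grace", "Hopper")]
people = ToyQuerySet(names, n=2)

people[0]
people[0]
assert people.query_count == 2  # two lookups, two separate random queries

bool(people)                    # full evaluation fills the cache
first = people[0]
assert people[0] is first       # now served from the cache
assert people.query_count == 3  # no further queries after bool()
```

In this model list(people) would fill the cache in the same way as bool(people), which matches the list(...) approach mentioned in the comments on the other answer.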

This seems very strange, that you can call a function using an argument, and modify the behaviour of the argument. Is there any way to determine if a function will have this effect on a queryset? And are there more classes like querysets that have a similar property? — Gunnar, Dec 01 '14 at 00:41
@Gunnar: actually, that's quite a common concept within programming, it's called a 'mutable' object. E.g. a tuple can't be changed (is immutable), and each mutation returns a copy, but a list is mutable, and functions like `append` will actually change the list instance itself. The fact that it is `list.append(item)` instead of `append(list, item)` is purely to make it more accessible and prevent it from cluttering global namespace. Querysets are only unique because some functions (`bool()`) have an effect on, at first, seemingly unrelated properties (order) due to how caching is implemented. — knbk, Dec 01 '14 at 14:48
