4

Is there a way in django to do the following more efficiently when the number Entry objects is greater than 5000 entries?

models.py

class Entry(models.Model):
    user = models.TextField(db_column='User', blank=True)
    date = models.DateTimeField(blank=True)

class Color(models.Model):
    color = models.TextField(blank=True)
    entry = models.ForeignKey(Entry)

And let's say that I wanted to get all the colors for each of these entries...

entrys = Entry.objects.all()

for e in entrys:
    print e.color_set.all()

I want to be able to relate each object to a specific entry. For example, in a csv table like this.

user, color
john, blue
john, orange
bob, green
bob, red
bob, purple

It takes several seconds to look through all of my entries. Is there a better way?

Shang Wang
  • 24,909
  • 20
  • 73
  • 94
John Waller
  • 2,257
  • 4
  • 21
  • 26
  • 1
    You can go other way around. Get all the Color objects and order them by `entry` and then iterate. – AKS Mar 09 '16 at 14:37
  • 2
    You can use `prefetch_related` as per [an answer I just made on a different question](http://stackoverflow.com/a/35893764/1324033) – Sayse Mar 09 '16 at 14:39
  • To extend @Sayse comment, `selected_related` could also be an option. https://docs.djangoproject.com/en/1.9/ref/models/querysets/#select-related You basically want to do an eager load. – themanatuf Mar 09 '16 at 14:40
  • @themanatuf - `select_related` isn't an option since that doesn't work for one-to-many – Sayse Mar 09 '16 at 14:41
  • @Sayse yes, you're right, but if the orig poster follows what @AKS suggested, then `select_related` would work. – themanatuf Mar 09 '16 at 14:43

2 Answers2

10

You should use prefetch_related

entrys = Entry.objects.all().prefetch_related('color_set')

for e in entrys:
    print e.color_set.all()

Instead of doing n, queries it will do 2, one for the entries, and one for the foreign key lookups

Sayse
  • 42,633
  • 14
  • 77
  • 146
  • This does not seem to run much faster... Am I missing something? – John Waller Mar 09 '16 at 15:19
  • @JohnWaller - I'm not sure, its possible that the time issue is no longer on a database side, but maybe just taking a while to print out the values or whatever the csv part of your question was referring to – Sayse Mar 09 '16 at 15:23
  • @JohnWaller - Of course, you are still trying to return quite a large number of records from your database so there must be some time involved there – Sayse Mar 09 '16 at 15:24
2

As I commented earlier, if you just need all the colors of an Entry together, you can select all the Color objects and order them on entry:

colors = Color.objects.order_by('entry')

Now, you can loop over these objects and print them the way you want:

for color in colors:
    print(color.entry.user, color.color)

# john, blue
# john, orange
# bob, green

Or, you can extract this information as values_list

color_entries = list(colors.values_list('entry__user', 'color'))
# [('john', 'blue'), ('john', 'orange'), ('bob', 'green'), ...]
AKS
  • 18,983
  • 3
  • 43
  • 54
  • What if I have another model that is also related to entry and I wanted those values too? – John Waller Mar 09 '16 at 14:52
  • 1
    This will still create n (5000) queries since it will retrieve the `entry` on every iteration. – Sayse Mar 09 '16 at 14:53
  • But if you use `values_list` it will be efficient. Or, am I missing something with `values_list`? – AKS Mar 09 '16 at 14:56
  • It will be more efficient for a different reason, since it will be selecting the certain values instead of `*` – Sayse Mar 09 '16 at 14:56
  • And, isn't that what exactly the OP asked for :) – AKS Mar 09 '16 at 14:57
  • It results in a different query though since you no longer have the objects – Sayse Mar 09 '16 at 14:59
  • 1
    I agree. And, I like your answer better. I Was just trying to provide an alternative way and limiting it to the example provided in the question. – AKS Mar 09 '16 at 15:01