Django equivalent for count and group by

Question

I have a model that looks like this:

class Category(models.Model):
    name = models.CharField(max_length=60)

class Item(models.Model):
    name = models.CharField(max_length=60)
    category = models.ForeignKey(Category)

I want select count (just the count) of items for each category, so in SQL it would be as simple as this:

select category_id, count(id) from item group by category_id

Is there an equivalent of doing this "the Django way"? Or is plain SQL the only option? I am familiar with the count( ) method in Django, however I don't see how group by would fit there.

Possible duplicate of [How to query as GROUP BY in django?](http://stackoverflow.com/questions/629551/how-to-query-as-group-by-in-django) — Ciro Santilli OurBigBook.com, May 25 '16 at 11:58
@CiroSantilli巴拿馬文件六四事件法轮功 how is this a duplicate? this question was asked in 2008, and the one you are referring to is 2 years later. — Sergey Golovchenko, May 25 '16 at 23:56
Current consensus is to close by "quality": Since "quality" is not measurable, I just go by upvotes. ;-) Likely it comes down to which question hit the best newbie Google keywords on the title. — Ciro Santilli OurBigBook.com, May 26 '16 at 07:04

score 137 · Accepted Answer · answered Aug 23 '09 at 05:21

137

Here, as I just discovered, is how to do this with the Django 1.1 aggregation API:

from django.db.models import Count
theanswer = Item.objects.values('category').annotate(Count('category'))

answered Aug 23 '09 at 05:21

michael

1,757
2
12
9

4

like most things in Django, none of this quite makes sense to look at but (unlike most things in Django) once i actually tried it, it was awesome :P – jsh Oct 20 '11 at 20:49
4

note that you need to use `order_by()` if `'category'` is not the default ordering. (See Daniel's more comprehensive answer.) – Rick Westera May 08 '13 at 23:40
The reason why this works is because [`.annotate()` works slightly differently after a `.values()`](https://docs.djangoproject.com/en/dev/topics/db/aggregation/#values): "However, when a values() clause is used to constrain the columns that are returned in the result set, the method for evaluating annotations is slightly different. Instead of returning an annotated result for each result in the original QuerySet, the original results are grouped according to the unique combinations of the fields specified in the values() clause." – mgalgs Feb 06 '18 at 19:44

score 58 · Answer 2 · answered Aug 27 '09 at 15:02

Since I was a little confused about how grouping in Django 1.1 works I thought I'd elaborate here on how exactly you go about using it. First, to repeat what Michael said:

Here, as I just discovered, is how to do this with the Django 1.1 aggregation API:
from django.db.models import Count
theanswer = Item.objects.values('category').annotate(Count('category'))

Note also that you need to from django.db.models import Count!

This will select only the categories and then add an annotation called category__count. Depending on the default ordering this may be all you need, but if the default ordering uses a field other than category this will not work. The reason for this is that the fields required for ordering are also selected and make each row unique, so you won't get stuff grouped how you want it. One quick way to fix this is to reset the ordering:

Item.objects.values('category').annotate(Count('category')).order_by()

This should produce exactly the results you want. To set the name of the annotation you can use:

...annotate(mycount = Count('category'))...

Then you will have an annotation called mycount in the results.

Everything else about grouping was very straightforward to me. Be sure to check out the Django aggregation API for more detailed info.

to perform same set of action on foreign key field Item.objects.values('category__category').annotate(Count('category__category')).order_by() — Mutant, Jan 26 '13 at 23:23

Carl Meyer · Answer 3 · 2009-08-29T15:58:41.500

(Update: Full ORM aggregation support is now included in Django 1.1. True to the below warning about using private APIs, the method documented here no longer works in post-1.1 versions of Django. I haven't dug in to figure out why; if you're on 1.1 or later you should use the real aggregation API anyway.)

The core aggregation support was already there in 1.0; it's just undocumented, unsupported, and doesn't have a friendly API on top of it yet. But here's how you can use it anyway until 1.1 arrives (at your own risk, and in full knowledge that the query.group_by attribute is not part of a public API and could change):

query_set = Item.objects.extra(select={'count': 'count(1)'}, 
                               order_by=['-count']).values('count', 'category')
query_set.query.group_by = ['category_id']

If you then iterate over query_set, each returned value will be a dictionary with a "category" key and a "count" key.

You don't have to order by -count here, that's just included to demonstrate how it's done (it has to be done in the .extra() call, not elsewhere in the queryset construction chain). Also, you could just as well say count(id) instead of count(1), but the latter may be more efficient.

Note also that when setting .query.group_by, the values must be actual DB column names ('category_id') not Django field names ('category'). This is because you're tweaking the query internals at a level where everything's in DB terms, not Django terms.

+1 for the old method. Even if currently unsupported, it's enlightening to say the least. Amazing, really. — airstrike, Jun 26 '11 at 02:12
Take a look to the Django aggregation API at https://docs.djangoproject.com/en/dev/topics/db/aggregation/#cheat-sheet.Many other complex tasks can be done with it, there you'll find some powerful examples. — serfer2, Feb 21 '14 at 12:13
@serfer2 yes, those docs are already linked from the top of this answer. — Carl Meyer, Feb 22 '14 at 16:45

S.Lott · Answer 4 · 2008-11-30T03:44:17.590

2

How's this? (Other than slow.)

counts= [ (c, Item.filter( category=c.id ).count()) for c in Category.objects.all() ]

It has the advantage of being short, even if it does fetch a lot of rows.

Edit.

The one query version. BTW, this is often faster than SELECT COUNT(*) in the database. Try it to see.

counts = defaultdict(int)
for i in Item.objects.all():
    counts[i.category] += 1

edited Nov 30 '08 at 03:44

answered Nov 29 '08 at 21:31

S.Lott

384,516
81
508
779

It is nice and short, however I would like to avoid having a separate database call for each category. – Sergey Golovchenko Nov 29 '08 at 22:38
This is a really good approach for simple cases. It falls down when you have a large dataset, and you want to order+limit (ie paginate) according to a count, without pulling down tons of unneeded data. – Carl Meyer Dec 01 '08 at 16:13
@Carl Meyer: True -- it can be doggy for a large dataset; you need to benchmark to be sure of that, however. Also, it doesn't rely on unsupported stuff either; it works in the interim until the unsupported features are supported. – S.Lott Dec 01 '08 at 16:43

Django equivalent for count and group by

4 Answers4

Linked

Related