120

I've got a simple Model like this:

class Order(models.Model):
    created = model.DateTimeField(auto_now_add=True)
    total = models.IntegerField() # monetary value

And I want to output a month-by-month breakdown of:

  • How many sales there were in a month (COUNT)
  • The combined value (SUM)

I'm not sure what the best way to attack this is. I've seen some fairly scary-looking extra-select queries but my simple mind is telling me I might be better off just iterating numbers, starting from an arbitrary start year/month and counting up until I reach the current month, throwing out simple queries filtering for that month. More database work - less developer stress!

What makes most sense to you? Is there a nice way I can pull back a quick table of data? Or is my dirty method probably the best idea?

I'm using Django 1.3. Not sure if they've added a nicer way to GROUP_BY recently.

ayhan
  • 70,170
  • 20
  • 182
  • 203
Oli
  • 235,628
  • 64
  • 220
  • 299
  • This is a duplicate of http://stackoverflow.com/questions/8596856/django-get-distinct-dates-from-timestamp and http://stackoverflow.com/questions/1236865/grouping-dates-in-django/8597940#8597940 – tback Jan 05 '12 at 16:58

8 Answers8

275

Django 1.10 and above

Django documentation lists extra as deprecated soon. (Thanks for pointing that out @seddonym, @Lucas03). I opened a ticket and this is the solution that jarshwah provided.

from django.db.models.functions import TruncMonth
from django.db.models import Count

Sales.objects
    .annotate(month=TruncMonth('created'))  # Truncate to month and add to select list
    .values('month')                          # Group By month
    .annotate(c=Count('id'))                  # Select the count of the grouping
    .values('month', 'c')                     # (might be redundant, haven't tested) select month and count 

Older versions

from django.db import connection
from django.db.models import Sum, Count

truncate_date = connection.ops.date_trunc_sql('month', 'created')
qs = Order.objects.extra({'month':truncate_date})
report = qs.values('month').annotate(Sum('total'), Count('pk')).order_by('month')

Edits

  • Added count
  • Added information for django >= 1.10
Ali Husham
  • 816
  • 10
  • 31
tback
  • 11,138
  • 7
  • 47
  • 71
  • I don't suppose there's a nice simple way to convert the date into a proper `datetime.date`? Either in-template or as part of the query? I'm currently passing this output through a list-comprehension to sort the date out. – Oli Jan 05 '12 at 17:56
  • what about `xx.month.date()` which whould be `{{ row.month.date }}` in the template? – tback Jan 05 '12 at 18:02
  • Yeah that's what I started with but `month` in the queryset is just a string (a side-effect of being database-generated, I assume). The `|date` filter doesn't know what to do with it. – Oli Jan 05 '12 at 18:08
  • 1
    what database backend are you using - it works fine in postgres `>>> qs.extra({'month':td}).values('month').annotate(Sum('total')) [{'total__sum': Decimal('1234.56'), 'month': datetime.datetime(2011, 12, 1, 0, 0)}]` – tback Jan 05 '12 at 18:25
  • If you use this method, you should be aware that the `extra` method is likely to be deprecated, and should only be used as a last resort. See the Django docs: https://docs.djangoproject.com/en/1.9/ref/models/querysets/#extra – seddonym Apr 21 '16 at 08:19
  • @tback hmm, extra should be deprecated in the future. – Lucas03 May 19 '16 at 11:41
  • @Lucas03 Fixed (Thanks to jarshwah) – tback May 23 '16 at 06:45
  • 1
    @seddonym Fixed (Thanks to jarshwah) – tback May 23 '16 at 06:45
  • does not work in recent versions of Django. truncmonth is no more present in django – Sudhakaran Packianathan Jun 17 '16 at 06:14
  • @SudhakaranPackianathan: Django 1.10 is scheduled for release in August 2016. Until then the method described under `older versions` should work just fine. – tback Jun 17 '16 at 10:38
  • 1
    Truncmonth is not available in Django 1.8 – Sudhakaran Packianathan Jun 17 '16 at 15:58
  • Django 1.10 was released on 1st of August 2016. @SudhakaranPackianathan: Django 1.8 (== Django 1.08) is two versions earlier then Django 1.10. – tback Sep 08 '16 at 08:53
  • I ran into problems with this approach when dealing with datetimes with multiple year values – Turtle Dec 01 '16 at 21:49
  • 2
    thanks, works great. corner case for the pre-1.10 version: if one joins/filters on other models which might have the same field (e.g.: timestamp), then one must fully qualify the field - `'{}.timestamp'.format(model._meta.db_table)` – zsepi Jan 13 '17 at 12:13
  • What can I do if i want to group by a specific day gap, like 7 days? – zubhav Feb 09 '17 at 23:03
  • 1
    Just a quick note that if the Django `USE_TZ` setting is `True`, the two versions are not exactly equivalent. The version using `TruncMonth` will convert the timestamp to the time zone specified by the `TIME_ZONE` setting before truncating, while the version using `date_trunc_sql` will truncate the raw UTC timestamp in the database. – Daniel Harding Feb 20 '18 at 13:42
  • `.annotate(month=TruncMonth('timestamp'))` is confusing because you are referencing a column name, but timestamp is a python type. – blindguy May 21 '21 at 17:59
57

Just a small addition to @tback answer: It didn't work for me with Django 1.10.6 and postgres. I added order_by() at the end to fix it.

from django.db.models.functions import TruncMonth
Sales.objects
    .annotate(month=TruncMonth('timestamp'))  # Truncate to month and add to select list
    .values('month')                          # Group By month
    .annotate(c=Count('id'))                  # Select the count of the grouping
    .order_by()
Rani
  • 6,424
  • 1
  • 23
  • 31
  • 1
    yup: https://docs.djangoproject.com/en/1.11/topics/db/aggregation/#interaction-with-default-ordering-or-order-by ... doesn't feel like good design but they're very very smart those django guys, so it actually is. – Williams Jun 29 '17 at 06:01
  • 1
    `TruncDate` allows you to group by date (day of month) – Neil Mar 12 '20 at 08:11
  • This did return the exact output I wanted to display. Thanks :) – Tareq Monwer Apr 15 '21 at 09:31
  • 1
    `.annotate(month=TruncMonth('timestamp'))` is confusing because you are referencing a column name, but timestamp is a python type. – blindguy May 21 '21 at 17:59
  • YES! It worked out. Postgres users, pay attention! – Artem Dumanov Oct 05 '21 at 11:08
  • It worked by adding `order_by` for Postgres DB. wow, not sure why, but that was the reason behind it. Finally able to fix it after spending 2 hours on the documentation. :( thanks. – A.J. May 04 '22 at 15:39
  • My db is very huge, a couple of millions of records. It is vey slow to be used in action. Any, idea? – keramat Aug 14 '22 at 16:38
14

Another approach is to use ExtractMonth. I ran into trouble using TruncMonth due to only one datetime year value being returned. For example, only the months in 2009 were being returned. ExtractMonth fixed this problem perfectly and can be used like below:

from django.db.models.functions import ExtractMonth
Sales.objects
    .annotate(month=ExtractMonth('timestamp')) 
    .values('month')                          
    .annotate(count=Count('id'))                  
    .values('month', 'count')  
Turtle
  • 1,369
  • 1
  • 17
  • 32
  • 2
    This does't work as soon as your data spans across multiple years as the months each year stay the same causing them to group together even when the year itself is different. – Spoontech Sep 07 '21 at 07:56
4
    metrics = {
        'sales_sum': Sum('total'),
    }
    queryset = Order.objects.values('created__month')
                               .annotate(**metrics)
                               .order_by('created__month')

The queryset is a list of Order, one line per month, combining the sum of sales: sales_sum

@Django 2.1.7

C.K.
  • 4,348
  • 29
  • 43
1

By month:

 Order.objects.filter().extra({'month':"Extract(month from created)"}).values_list('month').annotate(Count('id'))

By Year:

 Order.objects.filter().extra({'year':"Extract(year from created)"}).values_list('year').annotate(Count('id'))

By day:

 Order.objects.filter().extra({'day':"Extract(day from created)"}).values_list('day').annotate(Count('id'))

Don't forget to import Count

from django.db.models import Count

For django < 1.10

jatinkumar patel
  • 2,920
  • 21
  • 28
  • 3
    Yeah, great practice, import all from models –  Feb 05 '19 at 15:44
  • I clearly was being ironic. It is a horrible practice to do that. You should not do it and I would have downvoted just for that (which I didn't) –  Mar 16 '19 at 11:00
  • 1
    django.db.utils.OperationalError: near "from": syntax error – keramat Aug 14 '22 at 16:54
1

Here is how you can group data by arbitrary periods of time:

from django.db.models import F, Sum
from django.db.models.functions import Extract, Cast
period_length = 60*15 # 15 minutes

# Annotate each order with a "period"
qs = Order.objects.annotate(
    timestamp=Cast(Extract('date', 'epoch'), models.IntegerField()),
    period=(F('timestamp') / period_length) * period_length,
)

# Group orders by period & calculate sum of totals for each period
qs.values('period').annotate(total=Sum(field))
Max Malysh
  • 29,384
  • 19
  • 111
  • 115
1

Here's my dirty method. It is dirty.

import datetime, decimal
from django.db.models import Count, Sum
from account.models import Order
d = []

# arbitrary starting dates
year = 2011
month = 12

cyear = datetime.date.today().year
cmonth = datetime.date.today().month

while year <= cyear:
    while (year < cyear and month <= 12) or (year == cyear and month <= cmonth):
        sales = Order.objects.filter(created__year=year, created__month=month).aggregate(Count('total'), Sum('total'))
        d.append({
            'year': year,
            'month': month,
            'sales': sales['total__count'] or 0,
            'value': decimal.Decimal(sales['total__sum'] or 0),
        })
        month += 1
    month = 1
    year += 1

There may well be a better way of looping years/months but that's not really what I care about :)

Oli
  • 235,628
  • 64
  • 220
  • 299
  • BTW It will work fine but you know loop over months also not a great idea. What if someone wants to make it in Day of a month then this loop will iterate by 30-31 days. otherwise its working fine – Mayank Pratap Singh Feb 05 '18 at 04:12
  • this is too slow if you have millions of records – John Smith Nov 20 '19 at 22:25
  • @jifferent Absolutely! I added it to show what my solution was at the time of posting the question. The other answers are much better. – Oli Nov 21 '19 at 14:02
0

i have orders table in my database . i am going to count orders per month in the last 3 months

from itertools import groupby
from dateutil.relativedelta import relativedelta

date_range = datetime.now()-relativedelta(months=3)
aggs =Orders.objects.filter(created_at=date_range)\
            .extra({'date_created':"date(created_at)"}).values('date_created')

for key , group in groupby(aggs):
     print(key,len(list(group)))

created_at is datetime field. by extra function what done is taking date from datetime values. when using datetime we may not get the count correct because objects are created at different time in a day.

The for loop will print date and number of count