1

I have been scraping a data for every 30 seconds and storing it in this model.

class Frequency(models.Model):
    """Store the frequency  scraped"""
    timestamp = models.DateTimeField()
    frequency = models.DecimalField(max_digits=5, decimal_places=2)

Now i have been given a task that for every 15 minutes of a day i have to average out the results and group into something like 08:15-08:30, 08:30-08:45 .... 23:45-24:00.

What i thought is to use two loops. The outer one will loop in the hours in a day and the inner one will loop in (00, 15, 30, 45) and then alter todays datetime.now() and filter it.

Is there any better way or this is fine??

1 Answers1

2

By filtering independently for each time span, you'll make 24*4 database queries for each day, which is like a lot for such a simple operation. To make it in one query you can annotate your frequencies queryset and set closest_quarter_of_hour for each object. Then you can just take average frequency with aggregate and make a SQL group by clause with Django values method.

from django.db.models import CharField, Case, When, Q, Value, Avg
from django.db.models.functions import ExtractHour, ExtractMinute

today = timezone.now().date()
frequencies = Frequency.objects.filter(timestamp__date=today).annotate(
    hour=ExtractHour("timestamp"),
    minute=ExtractMinute("timestamp"),
).annotate(
    closest_quarter_of_hour=Case(
        When(minute__gte=0, minute__lt=15, then=Value('00-15')),
        When(minute__gte=15, minute__lt=30, then=Value('15-30')),
        When(minute__gte=30, minute__lt=45, then=Value('30-45')),
        When(minute__gte=45, then=Value('45-00')),
        output_field=CharField(),
    )
).aggregate(Avg("frequency")).values("hour", "closeset_quarter_of_hour")

Result: average frequency for each 15-minutes time span done in one SQL query.

rafaljusiak
  • 1,050
  • 1
  • 10
  • 17
  • That's exactly what i needed and it is really a great solution.. However on running it, it is giving me only one value.. – ABHISHEK TIWARI May 12 '20 at 16:21
  • oh, maybe there's my mistake. please try `.values("hour", "closeset_quarter_of_hour").annotate(average=Avg("frequency"))` instead of `.aggregate(Avg("frequency")).values("hour", "closeset_quarter_of_hour")`. if it'll work then I'll update my answer – rafaljusiak May 12 '20 at 17:31
  • I ran your code again. It gave error for values(No Attribute).. I removed it and the code ran successfully. But the number of results that i got were only 4. I need the 96 results that should be generated. My approach is that i first filter out the data for day then create a outer loop in range(24) and the inner loop in (0, 15, 30, 45) and filter again and again from the previous filtered result and aggregate the Avg. Your's one looks much more efficient and i will really be greatful and if you help me in solving this. And thanks you just showed me a new way to get result. – ABHISHEK TIWARI May 12 '20 at 20:25