Django multiple annotate slows down the query

Question

I have been debugging with django debug_toolbar, if I use more than one annotate in query then it takes a lot of time for Django to fetch the query results.

class Project_First(models.Model):
   project_first_results_M2M = models.ManyToManyField(Project_First_Results)

class Project_Second(models.Model):
   project_second_results_M2M = models.ManyToManyField(Project_Second_Results)

class Project(models.Model):
    project_first_M2M = models.ManyToManyField(Project_First)
    project_second_M2M = models.ManyToManyField(Project_Second)

I m trying to count all the objects present in project_first_results_M2M of all the project_first_M2M objects.

ie, let's suppose, project_first_M2M has 3 objects of Project_First and I want to count all the total project_first_results_M2M objects present in all 3 of them.

Project.objects.all().annotate(first_res_count=Count('project_first_M2M__project_first_results_M2M',distinct=True))

Above Query works fine and would take 80ms to fetch the results. But problem occurs when i try to add an additional annotate to query.

Project.objects.all().annotate(first_res_count=Count('project_first_M2M__project_first_results_M2M',distinct=True)).annotate(second_res_count=Count('project_second_M2M__project_second_results_M2M',distinct=True))

This would take almost 4000ms to fetch the results.

Both project_second_M2M and project_first_M2M contains the same fields and the same number of objects. I even tried in vice-versa condition to the above query and query slows down only when I add additional annotate.

Is there any fast and alternate solution to achieve the same in much efficient way? maybe with raw sql queries.
I want to count all the objects of project_first_results_M2M of all project_first_M2M objects within each Project object and similarly for project_second_results_M2M

score 2 · Answer 1 · answered Jan 07 '20 at 06:34

2

Probably you can use prefetch related:

Project.objects.prefetch_related('project_first_M2M__project_first_results_M2M', 'project_second_M2M__project_second_results_M2M').annotate(first_res_count=Count('project_first_M2M__project_first_results_M2M',distinct=True)).annotate(second_res_count=Count('project_second_M2M__project_second_results_M2M',distinct=True))

answered Jan 07 '20 at 06:34

ruddra

50,746
7
78
101

I'm not sure why, but using it along `prefetch_related` increases the query amount and time. – Arbazz Hussain Jan 07 '20 at 13:15
I tested with similar model structure as yours and had 25K entry per table, i did not face such issue. FYI i used mysql for testing. – ruddra Jan 07 '20 at 13:34
Can you please paste the raw SQL query it made with `.query`? – Arbazz Hussain Jan 07 '20 at 14:11

score 0 · Answer 2 · answered Jan 07 '20 at 16:21

If you annotate on multiple different joins you generate a large amount of JOINs (in that example a total of four), which will indeed "blow up", the solution table. But it of course does not make any sense to perform such query. If you here perform a single nested JOIN, then it results in two JOINs, and you take all records that are yielded into account. - Williem

The solution proposed by Matthew Schinckel at Django 1.11 Annotating a Subquery Aggregate using subquery trick for results in a much faster and optimized way.

Django multiple annotate slows down the query

2 Answers2

Linked