Django nested subqueries extremely slow

Question

I'm working on a project in which users can have one profile, each profile can have many accounts, and an account can have many payments. The important parts of the models are the following:

class Profile(models.Model):
    username = models.CharField(max_length=250)

class Account(models.Model):
    profile_id = models.PositiveIntegerField()

class Payment(models.Model):
    amount = models.DecimalField(max_digits=8, decimal_places=2)
    account_id = models.CharField() # An unique id for the account the payment belongs

I have to create an annotation in the Profile model with the Sum of the amounts of all payments from that user to manipulate this data with pandas afterwards. I managed to create a QuerySet using nested subqueries to get the job done, but this operation is extremely slow (slower than iterating through all of the profiles to calculate this value).

Here is how I did it:

payment_groups = Payment.objects.filter(account_id=OuterRef("pk")).order_by().values("account_id")

payments_total = Subquery(payment_groups.annotate(total=Sum("amount")).values("total"), output_field=models.DecimalField())

account_groups = Account.objects.filter(profile_id=OuterRef("pk")).order_by().values("profile_id")

accounts_total = Subquery(account_groups.annotate(total=Sum(paymments_total)).values("total"), output_field=models.DecimalField())

profiles = Profile.objects.all().annotate(account_total=acount_total)

What in this code is making the subqueries so slow? I have done similar subqueries before to get the sum of fields in a child model. I know that doing this in a grandchild is more complex and requires more time but it shouldn't be that slow.

Is there a way to improve the efficiency of this operation?

1) If one account can have many payments, why does Account have a ForeignKey to Payment, instead of a OneToManyField? 2) Is the goal of this query to get the balance of an individual account, or to get the balance of all accounts? — Nick ODell, Mar 21 '22 at 18:36
1) The schema I did was wrong, sorry about that. It's fixed now. I don't know why they chose to keep track of the other model's IDs the way they did. This code was made a long time ago. 2) The goal is to get the balance of each individual account — Romero Cartaxo, Mar 21 '22 at 19:16
Okay, what about something like `Payment.objects.values('account_id').order_by('account_id').annotate(balance=Sumnt('amount'))` to get the balance of each account? That turns into a GROUP BY, which are usually pretty fast. https://stackoverflow.com/questions/45547674/how-to-execute-a-group-by-count-or-sum-in-django-orm — Nick ODell, Mar 21 '22 at 19:55

score 0 · Accepted Answer · answered Feb 17 '23 at 12:17

You are experiencing a slow query because of following issues

you are using wrong Fields type for related models and explicitly you did not add db_index=True.
your Subqueries may have some mistakes ( missing [:1])

suggest

Use ForeignKey or OneToOne for your relation across mentioned models. they implicitly add db_index=True on your Database
In some situation you can add db_index=True on other non related models fields to speedup queries

here is an example

class Profile(models.Model):
    username = models.CharField(max_length=250)

class Account(models.Model):
    profile_id = models.ForeignKey(to=Profile, null=False, blank=True, on_delete=models.CASCADE)

class Payment(models.Model):
    amount = models.DecimalField(max_digits=8, decimal_places=2)
    account = models.ForeignKey(to=Account, null=False, blnak=True, on_delete=CASCADE)

payment_groups = Payment.objects.filter(account_id=OuterRef("pk")).order_by().values("account_id")

payments_total = Subquery(payment_groups.annotate(total=Sum("amount")).values("total")[:1], output_field=models.DecimalField())

account_groups = Account.objects.filter(profile_id=OuterRef("pk")).order_by().values("profile_id")

accounts_total = Subquery(account_groups.annotate(total=Sum(paymments_total)).values("total")[:1], output_field=models.DecimalField())

profiles = Profile.objects.all().annotate(account_total=acount_total)

Django nested subqueries extremely slow

1 Answers1