I'm working on a project in which users can have one profile, each profile can have many accounts, and an account can have many payments. The important parts of the models are the following:
class Profile(models.Model):
username = models.CharField(max_length=250)
class Account(models.Model):
profile_id = models.PositiveIntegerField()
class Payment(models.Model):
amount = models.DecimalField(max_digits=8, decimal_places=2)
account_id = models.CharField() # An unique id for the account the payment belongs
I have to create an annotation in the Profile model with the Sum of the amounts of all payments from that user to manipulate this data with pandas afterwards. I managed to create a QuerySet using nested subqueries to get the job done, but this operation is extremely slow (slower than iterating through all of the profiles to calculate this value).
Here is how I did it:
payment_groups = Payment.objects.filter(account_id=OuterRef("pk")).order_by().values("account_id")
payments_total = Subquery(payment_groups.annotate(total=Sum("amount")).values("total"), output_field=models.DecimalField())
account_groups = Account.objects.filter(profile_id=OuterRef("pk")).order_by().values("profile_id")
accounts_total = Subquery(account_groups.annotate(total=Sum(paymments_total)).values("total"), output_field=models.DecimalField())
profiles = Profile.objects.all().annotate(account_total=acount_total)
What in this code is making the subqueries so slow? I have done similar subqueries before to get the sum of fields in a child model. I know that doing this in a grandchild is more complex and requires more time but it shouldn't be that slow.
Is there a way to improve the efficiency of this operation?