1

Consider I have a simple model setup with JSONField

from django.db import models

import random


def simple_json_callable():
    return {"amount": random.randint(1, 100)}


def nested_json_callable():
    data = {
        "l1": {
            "l2": {
                "l3": {
                    "amount": random.randint(1, 100)
                }
            }
        }
    }
    return data


class Foo(models.Model):
    simple_json = models.JSONField(default=simple_json_callable)
    nested_json = models.JSONField(default=nested_json_callable)

I want to get the sum of amount key from both simple_json and nested_json fields.

I tried the following queries

Case 1: Annotate and then aggregate


result = Foo.objects.annotate(
    simple_json_amount=Cast('simple_json__amount', output_field=models.IntegerField()),
    nested_json_amount=Cast('nested_json__l1__l2__l3__amount', output_field=models.IntegerField()),
).aggregate(
    simple_json_total=models.Sum('simple_json_amount'),
    nested_json_total=models.Sum('nested_json_amount'),
)

Case 2: Aggregate

result = Foo.objects.aggregate(
    simple_json_total=models.Sum(Cast('simple_json__amount', output_field=models.IntegerField())),
    nested_json_total=models.Sum(Cast('nested_json__l1__l2__l3__amount', output_field=models.IntegerField())),

)

In both cases, I got the error

django.db.utils.DataError: cannot cast jsonb object to type integer

Question

What is the proper way to aggregate sum of values from a JSONField in Django?


Version

  • Django 3.1.X
  • Python 3.9.X
JPG
  • 82,442
  • 19
  • 127
  • 206

1 Answers1

2

For Django==3.2.X and newer versions

You can use the "double-underscore" to span the relationship and can get the aggregated result (as mentioned in the OP)

from django.db import models
from django.db.models.functions import Cast

result = Foo.objects.aggregate(
    simple_json_total=models.Sum(
        Cast("simple_json__amount", output_field=models.IntegerField()),
    ),
    nested_json_total=models.Sum(
        Cast("nested_json__l1__l2__l3__amount", output_field=models.IntegerField()),
    ),
)

# Result
# {'simple_json_total': 92, 'nested_json_total': 39}

For Django==3.1.X and older versions

You can use the KeyTextTransform(...) to extract the keys and aggregate the values.

from django.db import models
from django.db.models.functions import Cast
from django.db.models.fields.json import KeyTextTransform as KT

simple_kt_expr = KT('amount', 'simple_json')
nested_kt_expr = KT("amount", KT("l3", KT("l2", KT("l1", "nested_json"))))
result = Foo.objects.aggregate(
    simple_json_total=models.Sum(Cast(simple_kt_expr, output_field=models.IntegerField())),
    nested_json_total=models.Sum(Cast(nested_kt_expr, output_field=models.IntegerField())),

)

# Result
# {'simple_json_total': 92, 'nested_json_total': 39}

If you are using PostgreSQL and don't want to use the "nested usage of KeyTextTransform", you can create a custom Django DB function equivalent to the jsonb_extract_path_text(...)

from django.db.models import Aggregate


class JSONBExtractPathText(Aggregate):
    function = 'jsonb_extract_path_text'
    template = None

    def get_template(self):
        # https://stackoverflow.com/a/38985104/8283848
        paths = str(self.extra['path'])[1:-1]
        return f"%(function)s(%(jsonb_field)s, {paths})"

    def as_sql(self, *args, **kwargs):
        kwargs["template"] = self.get_template()
        return super().as_sql(*args, **kwargs)

and it can be used as

from django.db import models
from django.db.models.functions import Cast

simple_expr = JSONBExtractPathText(path=["amount"], jsonb_field="simple_json")
nested_expr = JSONBExtractPathText(path=["l1", "l2", "l3", "amount"], jsonb_field="nested_json")
result = Foo.objects.aggregate(
    simple_json_total=models.Sum(Cast(simple_expr, output_field=models.IntegerField())),
    nested_json_total=models.Sum(Cast(nested_expr, output_field=models.IntegerField())),

)

# Result
# {'simple_json_total': 92, 'nested_json_total': 39}

References

  1. Aggregate Django JSONField
  2. Django tickets - #26511, #33966
  3. Django PR - #16016
  4. String join without quotes
JPG
  • 82,442
  • 19
  • 127
  • 206