I have a situation where my model has a Foreign Key relationship:

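# models.py
from django.db import models

class Parent(models.Model):
    pass

class Child(models.Model):
    # related_name='child' is assumed here so the 'child' lookups used below resolve
    parent = models.ForeignKey(Parent, related_name='child', on_delete=models.CASCADE)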

and my serializer:

class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        queryset = Child.objects.filter(parent=parent).select_related('parent')
        serialized_data = ChildSerializer(queryset, many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent

When I list N Parents in my view, Django makes N database calls inside the serializer, one per parent, to fetch its children. Is there any way to get ALL children for ALL Parents to minimize the number of database calls?

I've tried this but it doesn't seem to solve my issue:

class ParentList(generics.ListAPIView):

    def get_queryset(self):
        queryset = Parent.objects.prefetch_related('child')
        return queryset

    serializer_class = ParentSerializer
    permission_classes = (permissions.IsAuthenticated,)

EDIT

I've updated the code below to reflect Alex's feedback, which solves the N+1 problem for one nested relationship.

# serializer.py
class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        # The all() call should hit the cache
        serialized_data = ChildSerializer(parent.child.all(), many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent

# views.py
from django.db.models import Prefetch
from rest_framework import generics, permissions

class ParentList(generics.ListAPIView):

    def get_queryset(self):
        children = Prefetch('child', queryset=Child.objects.select_related('parent'))
        queryset = Parent.objects.prefetch_related(children)
        return queryset

    serializer_class = ParentSerializer
    permission_classes = (permissions.IsAuthenticated,)

Now let's say I have one more model, which is a grandchild:

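# models.py
from django.db import models

class Parent(models.Model):
    pass

class Child(models.Model):
    # related_name='child' is assumed here so the 'child' lookups used above resolve
    parent = models.ForeignKey(Parent, related_name='child', on_delete=models.CASCADE)

class GrandChild(models.Model):
    parent = models.ForeignKey(Child, on_delete=models.CASCADE)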

If I place the following in my views.py for the Parent queryset:

queryset = Parent.objects.prefetch_related(children, 'children__grandchildren')

It doesn't look like those grandchildren are carried over into the ChildSerializer, so I'm running into another N+1 issue. Any thoughts on this one?

EDIT 2

Perhaps this will provide clarity. Maybe the reason I am still running into N+1 database calls is that both my children and grandchildren classes are polymorphic, i.e.

# models.py
from django.db import models
from polymorphic.models import PolymorphicModel

class Parent(models.Model):
    pass

class Child(PolymorphicModel):
    parent = models.ForeignKey(Parent, on_delete=models.CASCADE)

class Son(Child):
    pass

class Daughter(Child):
    pass

class GrandChild(PolymorphicModel):
    child = models.ForeignKey(Child, on_delete=models.CASCADE)

class GrandSon(GrandChild):
    pass

class GrandDaughter(GrandChild):
    pass

and my serializers look more like this:

# serializer.py
from rest_framework import serializers

class ChildSerializer(serializers.ModelSerializer):
    grandchild = serializers.SerializerMethodField('get_children_ordered')

    def to_representation(self, value):
        if isinstance(value, Son):
            return SonSerializer(value, context=self.context).to_representation(value)
        if isinstance(value, Daughter):
            return DaughterSerializer(value, context=self.context).to_representation(value)

    class Meta:
        model = Child

class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        queryset = Child.objects.filter(parent=parent).select_related('parent')
        serialized_data = ChildSerializer(queryset, many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent

The same pattern applies to GrandDaughter and GrandSon. I'll spare you the code, but I think you get the picture.

When I run my view for ParentList and monitor the DB queries, I see thousands of queries for only a handful of parents.

If I run the same code in the Django shell, I can accomplish the same thing in no more than 25 queries. I suspect it has something to do with the django-polymorphic library: there is a Child and a GrandChild database table in addition to each Son/Daughter and GrandSon/GrandDaughter table, for a total of 6 tables across those objects. So my gut tells me I'm missing those polymorphic tables.

Or perhaps there's a more elegant solution for my data model?

Dominooch
  • Possible duplicate of [Optimizing db queries in Django Rest Framework](http://stackoverflow.com/questions/26593312/optimizing-db-queries-in-django-rest-framework) – Kevin Brown-Silva Jan 28 '16 at 00:25
  • @KevinBrown, see my edit #2 above – Dominooch Feb 05 '16 at 00:31
  • @Dominooch my problem seems similar to yours: https://stackoverflow.com/questions/73914172/django-prefetch-related-in-nested-serializers-does-not-reduce-thousands-of-db-q The first layer of prefetching works but not for the second layer. – kevlar Oct 01 '22 at 00:34

2 Answers

9

As far as I remember, nested serializers have access to prefetched relations. Just make sure you don't modify the queryset: use all(), since calls such as filter() or order_by() build a new query and bypass the prefetch cache.

from django.db.models import Prefetch
from rest_framework import generics, permissions, serializers


class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        # The all() call should hit the cache
        serialized_data = ChildSerializer(parent.child.all(), many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent


class ParentList(generics.ListAPIView):

    def get_queryset(self):
        # Fetch all children for all parents in one extra query
        children = Prefetch('child', queryset=Child.objects.select_related('parent'))
        queryset = Parent.objects.prefetch_related(children)
        return queryset

    serializer_class = ParentSerializer
    permission_classes = (permissions.IsAuthenticated,)
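
To make the effect of prefetching concrete, here is a minimal sketch outside Django that uses only the standard-library sqlite3 module. It is not Django's implementation; it only mimics the documented idea behind prefetch_related (one query per relation, joined in Python). The table names, the sample rows, and the helpers run(), load_naive() and load_batched() are all hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema mirroring Parent -> Child -> GrandChild.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE grandchild (id INTEGER PRIMARY KEY, child_id INTEGER);
    INSERT INTO parent (id) VALUES (1), (2), (3);
    INSERT INTO child (id, parent_id) VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO grandchild (id, child_id) VALUES (100, 10), (101, 11), (200, 20);
""")

query_count = 0


def run(sql, params=()):
    """Execute a SELECT and count it, so both strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def placeholders(values):
    """Build the '?, ?, ?' part of an IN (...) clause."""
    return ", ".join("?" for _ in values)


def load_naive():
    """One query per parent and one per child: the N+1 pattern."""
    tree = {}
    for (parent_id,) in run("SELECT id FROM parent ORDER BY id"):
        tree[parent_id] = {}
        children = run(
            "SELECT id FROM child WHERE parent_id = ? ORDER BY id", (parent_id,)
        )
        for (child_id,) in children:
            rows = run(
                "SELECT id FROM grandchild WHERE child_id = ? ORDER BY id",
                (child_id,),
            )
            tree[parent_id][child_id] = [gc_id for (gc_id,) in rows]
    return tree


def load_batched():
    """One query per level, then join the rows in memory by foreign key."""
    parent_ids = [pid for (pid,) in run("SELECT id FROM parent ORDER BY id")]
    children = run(
        "SELECT id, parent_id FROM child "
        f"WHERE parent_id IN ({placeholders(parent_ids)}) ORDER BY id",
        parent_ids,
    )
    child_ids = [child_id for child_id, _ in children]
    grandchildren = run(
        "SELECT id, child_id FROM grandchild "
        f"WHERE child_id IN ({placeholders(child_ids)}) ORDER BY id",
        child_ids,
    )
    by_child = defaultdict(list)
    for gc_id, child_id in grandchildren:
        by_child[child_id].append(gc_id)
    tree = {parent_id: {} for parent_id in parent_ids}
    for child_id, parent_id in children:
        tree[parent_id][child_id] = by_child[child_id]
    return tree


query_count = 0
naive = load_naive()
naive_queries = query_count

query_count = 0
batched = load_batched()
batched_queries = query_count

print(naive == batched, naive_queries, batched_queries)
```

Both functions build the same nested structure, but the naive version's query count grows with the number of rows, while the batched version always issues one query per level.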
Alex Morozov
  • Ah yes, been banging my head on this one for a while. Thanks much!! – Dominooch Jan 26 '16 at 19:12
  • Alex, one last question. I have a grandchild inside of child. If I do `grandchildren = Prefetch('grandchild', queryset=GrandChild.objects.select_related('child'))` and insert this into my Parent `queryset = Parent.objects.prefetch_related(children, grandchildren)`, it throws an exception: cannot find grandchildren on Parent object. How would I do an additional nested relationship through the original Parent view? – Dominooch Jan 26 '16 at 20:32
  • Try `Prefetch('child__grandchild', queryset=GrandChild.objects.select_related('child'))`. – Alex Morozov Jan 27 '16 at 06:29
  • This was a big one in one of my projects. I thought we had prefetched the many queryset, but then when the serializer went to retrieve it, it added an `order_by` clause which meant that the prefetch cache couldn't be used. The data being retrieved tended to have 0-4 items in it, so it was simple to do the sorting in Python instead after it was out of the DB. – Michael Scott Asato Cuthbert May 05 '21 at 09:23
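
The last comment's workaround, sorting in Python once the rows are out of the database, might look roughly like the sketch below. The Child dataclass is only a stand-in for a model instance, and the name field and the children_ordered() helper are hypothetical. In Django the input would be parent.child.all(), which reuses the prefetch cache.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class Child:
    # Stand-in for a model instance; 'name' is a hypothetical field.
    id: int
    name: str


def children_ordered(prefetched, field="name"):
    """Sort already-loaded children in memory instead of calling order_by()."""
    return sorted(prefetched, key=attrgetter(field))


# In a serializer method this list would come from parent.child.all().
prefetched = [Child(3, "Zoe"), Child(1, "Adam"), Child(2, "Mia")]
names = [child.name for child in children_ordered(prefetched)]
print(names)
```

For a handful of items per parent, the in-memory sort is cheap compared with an extra query per parent.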
0

This question is a bit old, but I just came across a very similar problem and managed to reduce the DB calls drastically. It seems to me that django-mptt would make things much easier for you.

One way would be to define a single model with a ForeignKey to itself. This way you can determine an object's place in the hierarchy by its level in the tree. For example:

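from django.db import models
from mptt.models import MPTTModel, TreeForeignKey

class Person(MPTTModel):
    parent = TreeForeignKey('self', on_delete=models.CASCADE, null=True, blank=True, related_name='children', db_index=True)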

You can tell whether an object is a parent by checking whether its level is 0 (person.level == 0). A level of 1 means a child, 2 a grandchild, and so on.
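
As a rough illustration of what the level means, the sketch below computes it by walking parent pointers in plain Python. django-mptt stores the level on each row, so this is not how the library does it. The ROLES mapping, the helper names, and the sample parent_of data are hypothetical.

```python
ROLES = {0: "parent", 1: "child", 2: "grandchild"}


def level_of(node_id, parent_of):
    """Count the hops from a node up to its root (the root is level 0)."""
    level = 0
    while parent_of[node_id] is not None:
        node_id = parent_of[node_id]
        level += 1
    return level


def role_for(node_id, parent_of):
    """Map a node's level to a family role, with a fallback for deeper levels."""
    level = level_of(node_id, parent_of)
    return ROLES.get(level, f"descendant at level {level}")


# Maps each node id to its parent's id; None marks a root.
parent_of = {1: None, 2: 1, 3: 2, 4: 3}
roles = {node_id: role_for(node_id, parent_of) for node_id in parent_of}
print(roles)
```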

Then, you could modify your code to the following:

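# serializers.py
from rest_framework import serializers

class ChildSerializer(serializers.ModelSerializer):
    children = serializers.SerializerMethodField()

    def get_children(self, parent):
        queryset = parent.get_children()
        # Recurse so each level serializes its own children
        serialized_data = ChildSerializer(queryset, many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Person
        # Assumed field list; adjust to your model
        fields = ('id', 'children')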

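# views.py
from mptt.templatetags.mptt_tags import cache_tree_children
from rest_framework import generics

class ParentList(generics.ListAPIView):
    serializer_class = ChildSerializer

    def get_queryset(self):
        # cache_tree_children() returns the root nodes with their children cached
        return cache_tree_children(Person.objects.all())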

With this, you'll eliminate your N+1 problem. If you later add a ForeignKey to another model, for example a Genre model, you could simply change the queryset line to use select_related:

queryset = cache_tree_children(Person.objects.filter(channel__slug__iexact=channel_slug).select_related('genre'))