I'm writing an endpoint to fetch data from the "Term" model in Django REST framework, and I'm trying to reduce queries by prefetching data. Specifically, there is a model, "TermRelation", that stores vector relation scores between individual terms, and I would like to prefetch data from it. Simplified, the models look as follows:

models.py

class Term(models.Model):
    term = models.CharField(max_length=255, verbose_name=_('Term'), null=True, db_index=True)

class TermRelation(models.Model):
    src_term = models.ForeignKey(Term, on_delete=models.CASCADE, verbose_name=_('Source term'),
                                    related_name='src_term_relation')
    trg_term = models.ForeignKey(Term, on_delete=models.CASCADE, verbose_name=_('Target term'),
                                    related_name='trg_term_relation')
    vector_sim = models.FloatField(blank=True, null=True, default=0.0, verbose_name=_('Vector similarity'), help_text=_('Cosine vector similarity.'))

And here's the simplified view:

views.py

class TermsList(generics.ListCreateAPIView):
    def get_queryset(self):
        queryset = Term.objects.prefetch_related(
            'src_term_relation',
            'trg_term_relation',
            'note_set',
            'usage_set'
        ).all()
        return queryset

Other models related to Term, such as "Note" and "Usage", are prefetched as expected; only the relations still trigger a bunch of queries. I've included a screenshot of the Django SQL debug results, or rather the first few lines, as it goes on for a while with the same queries. You can see that Django does run the prefetch operation, but then still makes the same queries as if the prefetch hadn't happened.

What am I doing wrong? Could this be related to "TermRelation" having two ForeignKey fields pointing to the same model or REST framework not knowing how to resolve the related names?

EDIT:

I think I found something: the issue seems to lie elsewhere. In the serializer, there is a method field that counts the number of relations:

class TermSerializer(serializers.ModelSerializer):
    relations_count = serializers.SerializerMethodField()

    def get_relations_count(self, obj):
        rels = TermRelation.objects.filter(Q(src_term=obj) | Q(trg_term=obj))
        return len(rels)

    class Meta:
        model = Term
        fields = '__all__'

I'm assuming it runs a query over all TermRelations for each term that is returned by the serializer, ignoring the prefetched data. Is there a better way to do this?

Hanimir
  • Not related, but it's better to use TermRelation.objects.filter(Q(src_term=obj) | Q(trg_term=obj)).count() instead of len(): a COUNT query is much faster than fetching every row just to count them. – Sergey Pugach Jun 16 '23 at 10:44

1 Answer

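Try this; it may solve the extra-queries problem. The method field goes through the related managers that the view already prefetches, instead of building a fresh TermRelation query for every term: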

class TermSerializer(serializers.ModelSerializer):
    relations_count = serializers.SerializerMethodField()

    def get_relations_count(self, obj):
        return obj.src_term_relation.count() + obj.trg_term_relation.count()
weAreStarsDust