I have a Django REST Framework-based project where one of the endpoints is super slow. It turns out to be an N+1 queries problem: each of the hundreds of returned rows causes an additional query. The endpoint is not supposed to do any joins or additional queries, just return the contents of a single table, filtered a bit. The model for that table contains a few foreign keys, but none of the referents are supposed to be accessed; only the IDs are supposed to be rendered as JSON and sent to the client.
A bit of trial and error reveals that it's DRF's Serializer that causes the additional queries:
class FooSerializer(serializers.ModelSerializer):
    class Meta:
        model = Foo
        fields = ["data_field_a", "data_field_b", "bar_code", "qux_id"]
When I comment bar_code out in the serializer, the query becomes as fast as expected. What's surprising is that qux_id, even though the field is basically identical to bar_code, doesn't cause such a slowdown. I read https://www.django-rest-framework.org/api-guide/relations/ and, among other things, tried setting depth = 0 in the Meta class.
Also, I understand that using select_related or prefetch_related would essentially serve as a quick fix for the problem, but strictly speaking, those shouldn't be needed in this case, as no joins or additional queries are actually warranted, and I'm trying to understand the root cause of the problem to deepen my understanding.
So, I have a good grasp of the actual cause of the slowdown (additional queries made by the serializer), but less so of why the serializer decides to make those queries for one field but not for the other. My question is: how should I go about debugging this? I have no prior experience with DRF.
Edit: I was asked to share the code of the model, so here we go. I removed all the other fields, as they didn't seem to have anything to do with the bug; I could still reproduce it:
class Foo(models.Model):
    bar_code = models.ForeignKey(
        Bar, to_field="bar_code", db_column="bar_code", on_delete=models.CASCADE
    )
    qux_id = models.ForeignKey(
        Qux, to_field="qux_id", db_column="qux_id", on_delete=models.CASCADE
    )
(Edit: as it later turned out, the difference between these two is whether the referenced field is the primary key of the referred model or not.)
Edit2: Even more minimal:
class FooSerializer(serializers.ModelSerializer):
    class Meta:
        model = Foo
        fields = ["bar_code"]

class Foo(models.Model):
    bar_code = models.ForeignKey(
        Bar, to_field="bar_code", db_column="bar_code", on_delete=models.CASCADE
    )