What causes DRF Viewsets to perform inefficient SQL queries?

Question

Environment:

Django=="2.2"
Python=="3.6.8"
debug_toolbar=="1.11"
rest_framework=="3.8.2"

Background:

In switching from function-based DRF endpoints to class-based views / viewsets, I am finding that class-based views perform slightly better without nested relations and much worse with nested relations.

Pseudocode Model Context:

Creator:
  ForeignKey.Shipments[]

Unit:
  ForeignKey.Shipments[]

Shipment:
  ForeignKey.Addresses[]
  ForeignKey.Creator
  ForeignKey.Unit

Address:
  Street
  City
  State

Noteworthy Serializer Context:

Unit:
  Meta:
    Depth = 1

Creator:
  Meta:
    Depth = Infinite

When comparing viewsets of the above structure to the equivalent @api_view, I'm finding:

/addresses performs slightly better on SQL querycount (0.9x), with 1.1x CPU overhead (expected)
/creators performs slightly worse on SQL querycount (1.1x), with 3.9x CPU overhead (expected)
/creators/1 with a Read-Only Viewset performs slightly better on SQL querycount (0.9x), with 3.4x CPU overhead (unexpected)
/creators/1 with a Writeable Viewset performs significantly worse on SQL querycount (3.3x), with significant 5.9x CPU overhead (unexpected)
/units/1 with a Writeable Viewset performs profoundly worse on SQL querycount (4.4x), with a massive 7.9x CPU overhead (VERY unexpected)
Neither form of viewset benefits from prefetch_related in get_queryset; it appears to be disregarded entirely (VERY unexpected)

Breakdown of retrieve against @api_view:

(240 queries including 235 similar and 120 duplicates)

Breakdown of retrieve against viewsets.ReadOnlyModelViewSet:

(208 queries including 200 similar and 78 duplicates)

Breakdown of retrieve against viewsets.ModelViewSet:

(803 queries including 801 similar and 801 duplicates)

All of this seems very unintuitive.

How is it possible for two types of endpoints using the exact same serializer to perform so differently?

Compared to function-based views, class-based views and viewsets seem to incur a massive CPU and SQL performance hit where foreign keys are involved, for no reason that I can find.

I expect N+1 queries to perform poorly; what I don't expect is for them to perform more poorly depending on which style of DRF endpoint serves them.

Is there an explanation for why this happens?

score 2 · Accepted Answer · answered Jun 06 '19 at 18:08

The stats here are misleading. This happens because the DRF browsable API performs an extra request for each method the endpoint accepts.

Correspondingly, if a GET-only @api_view is taken as the 1x reference, an endpoint that accepts GET, PUT, PATCH, and DELETE would appear to perform 4x slower, because it performs 3 additional requests.
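As a rough sanity check, the query counts from the question can be compared against that 4x estimate. This is only a sketch: the two counts are the ones reported in the question, while the method list is the example from this answer and is assumed (not verified) to match what the browsable API actually renders for the writable viewset.

```python
# Query counts reported in the question for the retrieve endpoint,
# both measured through the browsable API.
read_only_queries = 208  # viewsets.ReadOnlyModelViewSet
writable_queries = 803   # viewsets.ModelViewSet

# Assumption: the writable endpoint accepts the four methods used in the
# example above, so the page does roughly four requests' worth of work.
methods = ["GET", "PUT", "PATCH", "DELETE"]

observed_ratio = writable_queries / read_only_queries
expected_ratio = len(methods)

print(f"observed {observed_ratio:.2f}x, expected about {expected_ratio}x")
```

The observed ratio lands close to the number of accepted methods, which is consistent with the explanation but does not prove it on its own.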

DRF's browsable API obscures the true speed and query count when more than one request method is available on an endpoint. To see the real performance of an endpoint, request it programmatically or append ?format=json.

I discovered this when profiling showed that the permissions class was being hit 6 times:
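A minimal sketch of the programmatic route, using only the standard library. The host, port, and path are hypothetical placeholders, and build_json_request is an illustrative helper, not part of DRF. It sets both the ?format=json query parameter and an Accept header so that the JSON renderer is selected instead of the browsable API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request


def build_json_request(url):
    """Build a GET request that asks for JSON rather than the browsable API."""
    parts = urlsplit(url)
    # Preserve any existing query parameters and force format=json.
    query = dict(parse_qsl(parts.query))
    query["format"] = "json"
    json_url = urlunsplit(parts._replace(query=urlencode(query)))
    return Request(json_url, headers={"Accept": "application/json"}, method="GET")


# Hypothetical local development URL; adjust to the real endpoint.
req = build_json_request("http://localhost:8000/creators/1/")
print(req.full_url)
```

Passing req to urllib.request.urlopen against a running development server sends the request; the query count for that single request can then be read from server-side logging or profiling rather than from the browsable API page.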

https://stackoverflow.com/a/52731612/784831

score -1 · Answer 2 · answered Jun 05 '19 at 21:33

Take a look at this answer: Optimizing database queries in Django REST framework. DRF does not automatically optimize your queries; you still need to do this yourself.

And if you want a bit more performance, take a look at this app: https://github.com/K0Te/drf-serializer-cache

Krukas

I wouldn't expect DRF to optimize the query. The curious thing here is that DRF actively worsens query performance depending on how it's served. I would expect all DRF view types to perform equivalently under normal conditions. – Noi Sek Jun 05 '19 at 22:19
