
I am using Python 2.7 and Django 1.7.1 with Django REST framework. I have an API that returns some specific values taken from the database. It uses a custom serializer like this:

class InventarioSerializer(serializers.ModelSerializer):
    item = serializers.RelatedField(source='producto.item')
    ubicacion = serializers.RelatedField(source='ubicacion.nombre')
    class Meta:
        model = Inventario
        fields = ('epc','item','cantidad','ubicacion')

My API's view is defined this way:

class ItemEnInventarioViewSet(InventarioListModelMixin, viewsets.ModelViewSet):
    serializer_class = InventarioSerializer
    renderer_classes = (UnicodeJSONRenderer,)

and my ListModelMixin is this:

class InventarioListModelMixin(object):
    def list(self, request, *args, **kwargs):
        item = request.QUERY_PARAMS.get('item', None)
        inventario = Inventario.objects.filter(producto__item = item)
        if inventario.count() == 0:
            return HttpResponse(u"El item %s no se encuentra en el inventario" % item,status=400)
        self.object_list = inventario
        # Switch between paginated or standard style responses
        page = self.paginate_queryset(self.object_list)
        if page is not None:
            serializer = self.get_pagination_serializer(page)
        else:
            serializer = self.get_serializer(self.object_list, many=True)  # <-- THIS IS THE PROBLEM
        return Response(serializer.data)

It works fine, but when I try to GET around 1000 or more entries from the DB, the serializer makes it very slow: around 25 to 35 seconds.

The Query to the DB is very simple so the DB is not the problem at all.

If I serialize the queryset with `data = serializers.serialize('json', myQuerySet)`, it takes at most 3 seconds, but I don't get the data in the shape I want; that's why I use a custom serializer.

Is there a faster way to GET that quantity of values? Maybe with another serializer? Any ideas?

**ANSWER (thanks to Kevin):** Changing the query to:

inventario = Inventario.objects.select_related('producto__item','ubicacion__nombre').filter(producto__item = item)

...keeps the serializer from hitting the database for every result row to retrieve the foreign-key values.

Kevin Brown-Silva
Alex Lord Mordor

2 Answers


> The Query to the DB is very simple so the DB is not the problem at all.

Make sure you do not have an N+1 issue with your queries. They may be simple, but if there are many of them, they will add up to a considerable amount of time. I've written quite a bit about fixing performance issues in Django REST Framework on here, and you can find a lot about it by searching around.
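To make the N+1 pattern concrete, here is a minimal sketch that uses the standard-library `sqlite3` module instead of Django. The table layout, the sample values (`'ABC'`, `'Bodega'`) and the helper names are assumptions made up for illustration; only the `producto` / `ubicacion` relationship comes from the question. It counts the queries needed when each row looks up its related values separately versus when they are fetched with a single JOIN, which is roughly what `select_related` arranges.

```python
import sqlite3

# Hypothetical schema loosely mirroring the question's models (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE producto (id INTEGER PRIMARY KEY, item TEXT);
    CREATE TABLE ubicacion (id INTEGER PRIMARY KEY, nombre TEXT);
    CREATE TABLE inventario (
        id INTEGER PRIMARY KEY,
        epc TEXT,
        cantidad INTEGER,
        producto_id INTEGER REFERENCES producto(id),
        ubicacion_id INTEGER REFERENCES ubicacion(id)
    );
    INSERT INTO producto VALUES (1, 'ABC');
    INSERT INTO ubicacion VALUES (1, 'Bodega');
""")
conn.executemany(
    "INSERT INTO inventario (epc, cantidad, producto_id, ubicacion_id) "
    "VALUES (?, 1, 1, 1)",
    [("EPC%04d" % i,) for i in range(100)],
)

query_count = 0


def run(sql, params=()):
    """Execute one query and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def serialize_lazy():
    """One main query, then two extra lookups per row (the N+1 pattern)."""
    rows = run("SELECT epc, cantidad, producto_id, ubicacion_id "
               "FROM inventario ORDER BY id")
    out = []
    for epc, cantidad, producto_id, ubicacion_id in rows:
        item = run("SELECT item FROM producto WHERE id = ?",
                   (producto_id,))[0][0]
        nombre = run("SELECT nombre FROM ubicacion WHERE id = ?",
                     (ubicacion_id,))[0][0]
        out.append({"epc": epc, "item": item,
                    "cantidad": cantidad, "ubicacion": nombre})
    return out


def serialize_joined():
    """A single query that pulls the related columns in with JOINs."""
    rows = run("SELECT i.epc, p.item, i.cantidad, u.nombre "
               "FROM inventario i "
               "JOIN producto p ON p.id = i.producto_id "
               "JOIN ubicacion u ON u.id = i.ubicacion_id "
               "ORDER BY i.id")
    return [{"epc": epc, "item": item, "cantidad": cantidad, "ubicacion": nombre}
            for epc, item, cantidad, nombre in rows]


query_count = 0
lazy_result = serialize_lazy()
lazy_queries = query_count

query_count = 0
joined_result = serialize_joined()
joined_queries = query_count

print("lazy: %d queries, joined: %d query" % (lazy_queries, joined_queries))
# prints "lazy: 201 queries, joined: 1 query"
```

Both functions return the same data; only the number of round trips differs, and against a networked database each of those round trips carries its own latency.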

> Is there a faster way to GET that quantity of values? Maybe with another serializer? Any ideas?

If your data does not change that often, or you can deal with any possible caching issues, you may benefit greatly from adding some caching to your API. drf-extensions provides quite a few useful mixins for caching that may help you if your issue is not actually with your queries.

> when I try to GET from the DB around 1000 or more entries

I understand that your code has pagination built into it, but I want to stress the value of using pagination when working with large amounts of data. Request time tends to grow roughly linearly with the amount of data: the more rows you have to retrieve, the longer it takes to retrieve them all.
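As far as I know, in the 2.x series of Django REST Framework `paginate_queryset` returns `None` unless a page size has been configured. A minimal sketch of enabling pagination globally, assuming the 2.x setting names (they were renamed in later releases) and with placeholder numbers that should be tuned to the actual API:

```python
# settings.py (fragment) -- assumes Django REST Framework 2.x setting names.
REST_FRAMEWORK = {
    "PAGINATE_BY": 100,                # default page size (placeholder value)
    "PAGINATE_BY_PARAM": "page_size",  # lets clients request ?page_size=N
    "MAX_PAGINATE_BY": 500,            # upper bound for client-requested sizes
}
```

With a page size in place, each request serializes at most one page of rows rather than the whole result set.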

Kevin Brown-Silva
  • I ran the query directly against the DB (MSSQL) and it takes less than 0.25 seconds to return the values. My data changes many times a day, so caching is not the best option: I might get the cached data once or twice before it changes again. Pagination sounds good, but in my code `page` is always None, so `get_pagination_serializer` is never reached. How can I set up pagination? Is it faster? – Alex Lord Mordor Nov 14 '14 at 17:58
  • If you are just doing the main query (`Blah.objects.all()`) then you are missing the extra queries required for the relationships. Because you are using `producto` and `ubicacion`, additional queries need to be made for those. So if you are getting one object, three queries need to be made. If each query takes 100ms, that's 300ms. If you are getting 100 objects, that's 201 queries, or about 20 seconds in queries alone. This is called an N+1 query problem, and I would definitely look it up (and read the linked answers). – Kevin Brown-Silva Nov 14 '14 at 19:31
  • Django REST Framework has pagination built-in: http://www.django-rest-framework.org/api-guide/pagination – Kevin Brown-Silva Nov 14 '14 at 19:32
  • You are totally right @KevinBrown, there is an N+1 issue U_U. I looked at the documentation, and changing my main query as you said did the trick. I'll edit the main post to reflect the answer. Thank you so much, I had no idea about N+1 issues! – Alex Lord Mordor Nov 14 '14 at 19:56

For me, N + 1 database queries did not turn out to be the answer. It took an afternoon of profiling to pinpoint, but after doing so the answer turned out to be, frustratingly, a few DecimalField fields in my serializer.

My use case was simple: 3000-4000 instances which needed to be serialized. All select_related optimizations had been performed; however, I was still seeing 2-3 seconds of serialization time rather than the 0.5-1.5 seconds I was expecting. After a few hours of trial and error (commenting and uncommenting fields), I saw a huge (50%) dip in runtime when I had all my DecimalFields commented out.

The solution, for me, was to change my DecimalFields to FloatFields. Of course, this comes at the cost of some precision, but for my purposes that was fine.
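A rough way to check whether decimal handling is a bottleneck in a given environment is a micro-benchmark like the sketch below. It is only a stand-in, not the serializer's actual code path: it assumes that rendering a decimal value amounts to quantizing it and converting it to a string, and the sample values are made up. One possibly relevant fact is that the `decimal` module is pure Python in Python 2.7 and only gained a C implementation in Python 3.3, so the gap may be much larger on 2.7.

```python
import timeit
from decimal import Decimal

# Made-up sample data: 1000 two-decimal values and their float equivalents.
decimals = [Decimal("1234.56") + i for i in range(1000)]
floats = [float(d) for d in decimals]

TWO_PLACES = Decimal("0.01")


def render_decimals():
    """Assumed stand-in for decimal rendering: quantize, then stringify."""
    return [str(d.quantize(TWO_PLACES)) for d in decimals]


def render_floats():
    """Float rendering: the values are passed through as floats."""
    return [float(f) for f in floats]


decimal_seconds = timeit.timeit(render_decimals, number=20)
float_seconds = timeit.timeit(render_floats, number=20)

print("decimals: %.4fs, floats: %.4fs" % (decimal_seconds, float_seconds))
```

The absolute numbers depend on the interpreter and machine, so it is worth running this on the same Python version the API uses before changing any model fields.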

Bobby
  • I can't find anything else that suggests why DecimalField would be so slow (I know this is two years ago). Besides changing the model field definitions, is there a different solution? – JohnO Jul 11 '17 at 20:00