
I wrote a machine learning application in Django so that a user can specify some parameters in a form and train a model. Once the model is trained, I want to serve requests like:

curl "http://localhost:8000/.../?model_input=XYZ"

and I want Django to return the output of the model for the input XYZ. Every example I have seen for Tastypie or REST framework builds its response from a queryset. How should I proceed if the response is not the result of a queryset but of a pure in-memory calculation? In my case, the response is the product of a matrix (the trained model) and a vector (the input), and this result is not stored in a table.

What is the recommended way to manage such requests? Any help is greatly appreciated. Regards, Patrick

Patrick

1 Answer


Django REST Framework does not require a model source or a queryset, though it works best with either of them. For this reason it provides a basic Serializer, as well as basic APIView classes that add content negotiation on top of standard Django class-based views.

You most likely won't need to use the Serializer unless you were looking to serialize the results object. The other common use for a Serializer is to validate the incoming data and convert it to an expected format.

If you are just looking to return a basic value (you didn't specify what "the result of a matrix multiplication" actually could be), then even just using the basic views is a step up from doing it all manually. The Response object that Django REST Framework provides lets you return arbitrary data and have it converted into a comparable JSON or XML representation automatically. You never need to call json.dumps or coerce the data into a specific representation; the Response object does it all for you.

from rest_framework.response import Response
from rest_framework import serializers, views

class IncredibleInputSerializer(serializers.Serializer):
    model_input = serializers.CharField()

class IncredibleView(views.APIView):

    def get(self, request):
        # Validate the incoming input (provided through query parameters)
        serializer = IncredibleInputSerializer(data=request.query_params)
        serializer.is_valid(raise_exception=True)

        # Get the model input
        data = serializer.validated_data
        model_input = data["model_input"]

        # Perform the complex calculations
        complex_result = model_input + "xyz"

        # Return it in your custom format
        return Response({
            "complex_result": complex_result,
        })

In the example above, we create an IncredibleInputSerializer that validates the model_input query parameter to make sure that it is included in the request. This is a very basic example; Django REST Framework also supports further processing of the input, such as converting it to a number or validating that it conforms to a specific format.
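To make the "complex calculations" step more concrete, here is a minimal, framework-independent sketch of what could replace `model_input + "xyz"`. It assumes (this is not stated in the question) that the input arrives as comma-separated numbers; `WEIGHTS`, `parse_vector` and `predict` are hypothetical names, and the tiny matrix only stands in for the trained model.

```python
from typing import List, Sequence

# Hypothetical stand-in for the trained model: a 2x3 weight matrix.
WEIGHTS: List[List[float]] = [
    [1.0, 0.0, 2.0],
    [0.5, -1.0, 0.0],
]


def parse_vector(raw: str, expected_len: int) -> List[float]:
    """Turn a string such as "1,2,3" into a list of floats.

    Raises ValueError if a value is not a number or the length is wrong.
    """
    try:
        vector = [float(part.strip()) for part in raw.split(",")]
    except ValueError as exc:
        raise ValueError(
            f"model_input must be comma-separated numbers, got {raw!r}"
        ) from exc
    if len(vector) != expected_len:
        raise ValueError(f"expected {expected_len} values, got {len(vector)}")
    return vector


def predict(weights: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Multiply the matrix by the vector, one dot product per row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


if __name__ == "__main__":
    vector = parse_vector("1,2,3", expected_len=len(WEIGHTS[0]))
    print(predict(WEIGHTS, vector))
```

Because the result is a plain list of floats, it could be passed to Response as-is. Inside the view, a ValueError from `parse_vector` would probably be best turned into a validation error so the client receives a 400 response rather than a server error.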

Of course, if you need to serialize an object or a list of objects, that's where Django REST Framework excels. It doesn't have to be a model object; it can be an object with attributes or methods to get the data, or even just a basic dictionary, and Django REST Framework should be able to serialize it for you.

Kevin Brown-Silva
  • Thank you very much Kevin, your answer is much appreciated! I realize that this may be another aspect of the question, but is there a way in your example to avoid loading the matrix each time a request is sent? In other words, the matrix, which is very large, is the same for all requests and I would like to load it only once, when the server starts. Best regards, Patrick – Patrick Jan 07 '15 at 14:16
  • Doing that is going to require some sort of global state to be set up, which will vary based on where you need the data, but [you can hook into Django's ready events](http://stackoverflow.com/a/16111968/359284) to do it. – Kevin Brown-Silva Jan 07 '15 at 14:28
  • Thanks for the link Kevin. After reading Pykler's post I understand that some code can be executed when the server starts, but I'm still not sure how the data can be *shared*. Anyway, I will address this specific question there. Thanks again, Patrick – Patrick Jan 07 '15 at 20:07
  • I had the same kind of issue you're describing, Patrick, and I found a solution using ZeroMQ (http://zeromq.org). I have a C++ program running in the background, waiting for tasks to be sent through ZMQ (and having heavy objects loaded in memory at startup). Every time a request arrives on my Django API, xyz is sent as a message to the C++ program, which performs the calculation and sends the result back to Python. Many "Python clients" can send requests to the same "C++ server" at the same time. Very convenient, but it requires some setup. – Istopopoki Feb 03 '16 at 17:12
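
On the question raised in the comments (loading the matrix once and sharing it between requests), one simple option is to keep it as module-level state behind a cached loader. The sketch below is only an illustration under stated assumptions: `get_model` and `handle_request` are hypothetical names, and the identity matrix stands in for the real, expensive load. State held this way is shared by all requests served by the same Python process; if the server runs several worker processes, each one would load its own copy.

```python
import functools
from typing import List, Sequence


@functools.lru_cache(maxsize=1)
def get_model() -> List[List[float]]:
    """Load the matrix on first use; later calls return the same cached object."""
    # Placeholder for the expensive step (for example, reading the matrix from disk).
    return [[1.0, 0.0], [0.0, 1.0]]


def handle_request(vector: Sequence[float]) -> List[float]:
    """Stand-in for the body of the view: reuse the cached matrix for each request."""
    weights = get_model()
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


if __name__ == "__main__":
    print(handle_request([1.0, 2.0]))
    print(handle_request([3.0, 4.0]))
    # The loader body ran once; the second request was served from the cache.
    print(get_model.cache_info())
```

Calling `get_model()` from the ready hook mentioned above should warm the cache at startup, so that the first request does not pay the loading cost.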