
I have been using stomp.py and stompest for years now to communicate with ActiveMQ to great effect, but this has mostly been with standalone Python daemons.

I would like to use these two libraries from the webserver to communicate with the backend, but I am having trouble finding out how to do this without creating a new connection for every request.

Is there a standard approach to safely handling TCP connections in the webserver? In other languages, some sort of global object at that level would be used for connection pooling.

  • Just as a note, I am stuck using ActiveMQ because of the way our infrastructure works, so moving off is not really an option for me. – user3253945 May 25 '18 at 17:26
  • How many requests per second is your Django instance handling (or expected to handle)? – Max Malysh May 25 '18 at 17:29
  • It is an admin/management app for a high-throughput system. The app itself should not have more than a dozen or so users on it. – user3253945 May 25 '18 at 22:42
  • Then why are you concerned with connection pooling? – Max Malysh May 25 '18 at 22:46
  • "The standard approach" is to use Nginx in front of Django to perform connection pooling and to handle slow clients. But this shouldn't matter for an admin app, as Nginx really matters only when you are serving hundreds or thousands of requests per seconds. – Max Malysh May 25 '18 at 22:56
  • Nginx would really be for people connecting to the web application, correct? I am more concerned about the web application talking to some long-running processes. – user3253945 May 26 '18 at 03:03

1 Answer


HTTP is a synchronous protocol. Each waiting client consumes server resources (CPU, memory, file descriptors) while waiting for a response. This means the web server has to respond quickly: an HTTP web server should not block on external long-running processes while responding to a request.

The solution is to process requests asynchronously. There are two major options:

  1. Use polling.

    POST pushes a new task to a message queue:

    POST /api/generate_report
    {
         "report_id": 1337
    }
    

    GET checks the MQ (or a database) for a result:

    GET /api/report?id=1337
    {
        "ready": false
    }
    
    GET /api/report?id=1337
    {
        "ready": true,
        "report": "Lorem ipsum..."
    }
    

    Asynchronous tasks in the Django ecosystem are usually implemented with Celery, but you can use any MQ directly (see the Celery sketch after this list).

  2. Use WebSockets (see the Channels sketch below).
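
For option 1, a minimal Celery sketch could look like the following. The tasks.py module, the generate_report task, and the report text are just placeholders; also note that Celery needs one of its supported brokers (such as RabbitMQ or Redis), so this does not by itself cover the ActiveMQ/STOMP case from the question:

projectName/appName/tasks.py:

from celery import shared_task

@shared_task
def generate_report(report_id):
    # The long-running work happens in a Celery worker, not in the web process.
    return "Lorem ipsum..."  # placeholder for the real report generation

projectName/appName/views.py:

from django.http import JsonResponse
from .tasks import generate_report

def start_report(request):
    # POST /api/generate_report: enqueue the task and return immediately.
    result = generate_report.delay(report_id=1337)
    return JsonResponse({'task_id': result.id})

def report_status(request):
    # GET /api/report?task_id=...: poll for the result without blocking.
    result = generate_report.AsyncResult(request.GET['task_id'])
    if result.ready():
        return JsonResponse({'ready': True, 'report': result.get()})
    return JsonResponse({'ready': False})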

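For option 2, a bare-bones Django Channels consumer is sketched below; the ReportConsumer name and the message shape are just placeholders:

projectName/appName/consumers.py:

import json

from channels.generic.websocket import WebsocketConsumer

class ReportConsumer(WebsocketConsumer):
    def connect(self):
        # Accept the WebSocket handshake.
        self.accept()

    def receive(self, text_data=None, bytes_data=None):
        # In a real app you would push this message when the background task
        # finishes; here we just echo a "ready" payload back to the client.
        request = json.loads(text_data)
        self.send(text_data=json.dumps({
            'report_id': request.get('report_id'),
            'ready': True,
        }))
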
Helpful links:

  1. What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?
  2. https://en.wikipedia.org/wiki/Push_technology
  3. https://www.reddit.com/r/django/comments/4kcitl/help_in_design_for_long_running_requests/
  4. https://realpython.com/asynchronous-tasks-with-django-and-celery/
  5. https://blog.heroku.com/in_deep_with_django_channels_the_future_of_real_time_apps_in_django

Edit:

Here is a pseudocode example of how you can reuse a connection to an MQ:

projectName/appName/services.py:

import stomp

def create_connection():
    # Open a STOMP connection to the local ActiveMQ broker (stomp.py 4.x API).
    conn = stomp.Connection([('localhost', 9998)])
    conn.start()
    conn.connect(wait=True)
    return conn

# Module-level code runs once per worker process, when the module is first imported.
print('This code will be executed only once per worker process')
activemq = create_connection()

projectName/appName/views.py:

from django.http import HttpResponse
from .services import activemq

def index(request):
    # Reuse the already-open connection; stomp.py's send() takes body=, not message=.
    activemq.send(body='foo', destination='bar')
    return HttpResponse('Success!')
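
If the broker restarts, the module-level connection above will go stale. One possible guard, built on stomp.py's is_connected() check (the get_connection helper and the lock are just illustrative), looks like this:

projectName/appName/services.py (variant with a reconnect guard):

import threading

import stomp

_lock = threading.Lock()
_conn = None

def get_connection():
    # Lazily create the connection and reconnect if it has dropped.
    global _conn
    with _lock:
        if _conn is None or not _conn.is_connected():
            _conn = stomp.Connection([('localhost', 9998)])
            _conn.start()
            _conn.connect(wait=True)
        return _conn

Views then call get_connection().send(body='foo', destination='bar') instead of importing the activemq object directly.
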
Max Malysh
  • Ooh, I really like the look of channels. Out of scope for me, but really cool. Ultimately, what I really needed was a way to do what django-celery allows: as in take a web request and queue something in a message broker without creating a new TCP connection to the message broker every request. Except with ActiveMQ and STOMP. I am going to accept this answer because it has given me some perspective even if my issue is unresolved. – user3253945 May 27 '18 at 22:47
  • @user3253945 just make the connection object a module-level variable, and there will be just one connection per worker process. The connection will be reused on each request. – Max Malysh May 27 '18 at 23:26
  • Module-level variables are evaluated just once (during the first import). They are kind of "global" objects. So create a connection in another module (module == any `.py` file) and import it everywhere you need it. I've added an example. – Max Malysh May 27 '18 at 23:43
  • That is not a bad idea. Probably going to require some careful coding to avoid getting stuck in a bad state and to deal with Keepalives, but it might work. Definitely means that I will have to limit gunicorn's thread count to make sure I do not have too many active connections. Thanks! – user3253945 May 28 '18 at 03:28