I want to implement long polling in Python using cyclone or tornado, with the scalability of the service in mind from the beginning. Clients might stay connected to this service for hours. My concept:
Client HTTP requests will be processed by multiple tornado/cyclone handler threads behind an NGINX proxy (serving as a load balancer). There will be multiple data queues: one for all unprocessed requests from all clients, and the rest holding responses specific to each connected client, previously generated by worker processes. When a request is delivered to a tornado/cyclone handler thread, the request data will be sent to the worker queue and then processed by workers (which connect to the database etc.). Meanwhile, the tornado/cyclone handler thread will look into the client-specific queue and send a response with data back to the client (if any is waiting in the queue). Please see the diagram.
Simple diagram: https://i.stack.imgur.com/9ZxcA.png
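A minimal sketch of this flow, to make the concept concrete. It assumes plain `asyncio` queues inside a single process in place of tornado/cyclone handlers and a real queue system, and `Hub`, `worker`, `long_poll` and `demo` are made-up names for illustration only:

```python
import asyncio


class Hub:
    """One shared queue of unprocessed requests plus one response queue per client."""

    def __init__(self):
        # Assumption: in-process queues stand in for a real queue system.
        self.work_queue = asyncio.Queue()
        self.client_queues = {}

    def queue_for(self, client_id):
        # Create the client-specific response queue on first use.
        if client_id not in self.client_queues:
            self.client_queues[client_id] = asyncio.Queue()
        return self.client_queues[client_id]


async def worker(hub):
    """Take requests off the shared queue and post results to the client's queue."""
    while True:
        client_id, payload = await hub.work_queue.get()
        await asyncio.sleep(0.01)  # placeholder for the heavy database work
        await hub.queue_for(client_id).put({"echo": payload})
        hub.work_queue.task_done()


async def long_poll(hub, client_id, payload=None, timeout=1.0):
    """Handler side: enqueue the request (if any), then wait for a response."""
    if payload is not None:
        await hub.work_queue.put((client_id, payload))
    try:
        return await asyncio.wait_for(hub.queue_for(client_id).get(), timeout)
    except asyncio.TimeoutError:
        return None  # nothing arrived in time; the client would poll again


async def demo():
    hub = Hub()
    task = asyncio.ensure_future(worker(hub))
    answered = await long_poll(hub, "client-a", payload="hello")
    empty = await long_poll(hub, "client-b", timeout=0.05)
    task.cancel()
    return answered, empty


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the handler never blocks on the database itself; it only waits on the client-specific queue with a timeout. With several handler processes behind NGINX, the queues could no longer be in-process objects and would have to live in some shared external system.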
I am considering a queue system because some requests might be pretty heavy on the database, and some requests might create notifications and messages for other clients. Is this the way to go towards a scalable server, or is it just overkill?