8

I am fairly new to creating web services in Python. I have created a Flask web service successfully and run it with Gunicorn (as Flask’s built-in server is not suitable for production). This is how I run my app (with 4 worker processes):

    gunicorn --bind 0.0.0.0:5000 My_Web_Service:app -w 4

The problem is, this only handles 4 requests at a time. I want it to be able to handle potentially thousands of requests concurrently. Should I be using multi-threading? Any other options/suggestions?

Swapnil

2 Answers

6

According to the section on Workers, you have to switch to an async worker class, which can handle thousands of connections, provided your work is I/O-bound. Using more worker processes than you have CPUs is not recommended.
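For example, switching to the gevent worker class (the gevent package has to be installed alongside Gunicorn) might look roughly like this, reusing the command from the question:

    gunicorn --bind 0.0.0.0:5000 My_Web_Service:app -w 4 -k gevent --worker-connections 1000

Each worker can then multiplex many connections at once, as long as the request handlers spend their time waiting on I/O rather than doing CPU-heavy work.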

Daniel
  • sure! I'll give it a try! – Swapnil Jul 13 '17 at 03:03
  • 3
    Hi, so I tried using gevent async workers like this: gunicorn --bind 0.0.0.0:5000 service:app -k gevent --worker-connections 1000 This works, but it still seems to process incoming requests sequentially. For eg, when I pass 2 requests simultaneously and log times for each, I observe that the service one request at a time and starts serving another request after first one is done. – Swapnil Jul 23 '17 at 04:20
  • Do you understand, how gevent works? What kind of request do you answer? How do you profile your server? – Daniel Jul 23 '17 at 06:56
  • 2
    I am new to using Gunicorn and Gevent. I googled about creating async workers and a few links recommended me to use Gevent. I have used Flask to create a web service. I have defined 2 endpoints for 2 different types of processes. The problem is, if I try to send requests to both the endpoints, it waits for the first request to process and respond and then processes the second request after the first one is done. I think it's still doing Sync instead of Async. – Swapnil Aug 01 '17 at 17:38
  • I am facing the same issue and haven't solved it yet. Have you solved it? – mg52 Aug 16 '20 at 12:53
0

I'd switch from Flask to FastAPI and combine it either with async I/O or (if it's not possible to find non-blocking versions of all your functions) with a multiprocessing pool (not multithreading, which would still be blocked by the GIL and thus be slightly slower).
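A minimal sketch of that combination might look something like the following (the module name main.py, the endpoint paths and the heavy_task function are just placeholders): non-blocking work is awaited directly on the event loop, while blocking, CPU-bound work is pushed into a process pool.

    # main.py -- minimal FastAPI sketch (names are placeholders)
    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    from fastapi import FastAPI

    app = FastAPI()
    pool = ProcessPoolExecutor()            # separate processes sidestep the GIL

    def heavy_task(n: int) -> int:
        # Placeholder for a blocking, CPU-bound function.
        return sum(i * i for i in range(n))

    @app.get("/io")
    async def io_endpoint():
        # Placeholder for genuinely non-blocking I/O (database call, HTTP request, ...).
        await asyncio.sleep(1)
        return {"status": "done"}

    @app.get("/cpu")
    async def cpu_endpoint(n: int = 10_000_000):
        # Offload blocking work to the process pool so the event loop stays free.
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(pool, heavy_task, n)
        return {"result": result}

With this layout the event loop is never blocked, so requests to different endpoints no longer have to wait for each other.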

Among production servers, Gunicorn is probably still the best process manager, but since FastAPI needs an ASGI server, you have to combine it with Uvicorn workers.
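For example (assuming the FastAPI app above lives in main.py):

    gunicorn --bind 0.0.0.0:5000 -w 4 -k uvicorn.workers.UvicornWorker main:app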

mirekphd