
In terms of Gunicorn, I am aware there are various worker classes but for this conversation I am just looking at the sync and async types.

From my understanding ...

sync
workers = (2 * cpu) + 1
worker_class = sync

async (gevent)
workers = 1
worker_class = gevent
worker_connections = a value (let's say 2000)

So (based on a 4-core system), using sync workers I can have a maximum of 9 connections being processed in parallel. With async I can have up to 2000, with the caveats that come with async.
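For concreteness, the two setups above might look like this in a gunicorn.conf.py (a sketch; the app module in the launch command is a placeholder):

```python
# gunicorn.conf.py sketch of the two setups above, on a 4-core box
import multiprocessing

# --- sync ---
workers = multiprocessing.cpu_count() * 2 + 1   # (2 * 4) + 1 = 9
worker_class = "sync"

# --- async (gevent) alternative: swap these in for the two settings above ---
# workers = 1
# worker_class = "gevent"
# worker_connections = 2000
```

started with something like `gunicorn -c gunicorn.conf.py myproject.wsgi:application`.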

Questions

  • So where do threads fit in? Can I add threads to both the sync and async worker types?
  • What is the best option for gunicorn workers if I want to place gunicorn in front of a Django API that needs to process hundreds of requests in parallel?
  • Are gevent and sync worker classes thread safe?

1 Answer

Let me attempt an answer. Let us assume that at the beginning my deployment only has a single gunicorn worker. This allows me to handle only one request at a time. My worker's work is just to make a call to google.com and get the search results for a query. Now I want to increase my throughput. I have the below options:

Keep one worker only and increase the number of threads in that worker

This is the easiest. Since threads are more lightweight (less memory consumption) than processes, I keep only one worker and add several threads to it. Gunicorn will then let that single worker handle more than one request at a time. With, say, 4 threads, the worker can work on 4 requests concurrently. Fantastic. Now why would I ever need more workers?
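A minimal sketch of that single-worker, multi-threaded setup (the thread count of 4 is just an example; per the gunicorn docs, setting threads above 1 makes gunicorn use the gthread worker class):

```python
# gunicorn.conf.py sketch: one process, a small thread pool
workers = 1
threads = 4        # gunicorn switches to the "gthread" worker once threads > 1
```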

To answer that, assume that I need to do some work on the search results that Google returned. For instance, I might also want to calculate a prime number for each result. Now my workload is compute bound, and I hit the problem of Python's global interpreter lock (GIL). Even though I have 4 threads, only one thread can actually process the results at a time. This means that to get true parallel performance I need more than one worker.
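A rough standalone way to see the GIL effect (timings are machine-dependent): the same CPU-bound function run on 4 threads takes roughly 4x the single-run time, not roughly the same time.

```python
# Sketch: CPU-bound work does not speed up with threads because of the GIL
import time
from concurrent.futures import ThreadPoolExecutor

def count_primes(limit):
    # naive, purely CPU-bound prime counting
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

start = time.perf_counter()
count_primes(200_000)
print("1 run, 1 thread :", round(time.perf_counter() - start, 2), "s")

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(count_primes, [200_000] * 4))
print("4 runs, 4 threads:", round(time.perf_counter() - start, 2), "s")  # ~4x, not ~1x
```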

Increase Number of workers, but all workers are single-threaded

This is what I need when I want true parallel processing. Each worker can call google.com, get results, and do its processing, all in parallel. Fantastic. But the downside is that processes are heavier than threads, and my system might not keep up with the memory demands of adding ever more workers to gain parallelism. So the best solution is to increase the number of workers and also add more threads to each worker.

Increase Number of workers and each worker is multithreaded

I guess this needs no further explanation.
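For completeness, a sketch of the combined setup (all numbers are illustrative, not recommendations):

```python
# gunicorn.conf.py sketch: several processes, each with its own thread pool
import multiprocessing

workers = multiprocessing.cpu_count() * 2 + 1   # processes give true parallelism for CPU-bound work
threads = 4                                     # threads let I/O-bound requests overlap inside each process
worker_class = "gthread"                        # implied anyway once threads > 1
```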

Change worker type to Async

Now why would I ever want to do this? To answer that, remember that even threads consume memory. The gevent library implements coroutines (a radical construct that you can look up) which give you thread-like concurrency without actually creating threads. So if you configure gunicorn to use the gevent worker type, you get the benefit of NOT having to create threads in your workers: you effectively get "threads" without explicitly creating them.

So, to answer your question: if you are using a worker_class other than sync (or its threaded variant, gthread), you do not need to increase the number of threads in your gunicorn configuration. You can do it, by all means, but it kind of defeats the purpose.
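A sketch of that gevent configuration (the connection cap is just the number from the question); note that there is no threads setting here:

```python
# gunicorn.conf.py sketch: one gevent worker multiplexing many greenlets
workers = 1                  # can still be raised to use more cores
worker_class = "gevent"
worker_connections = 2000    # cap on simultaneous clients handled per worker
```

As far as I know, gunicorn's gevent worker applies gevent's monkey patching for you, but anything that blocks outside the patched standard library (C extensions, some database drivers) can still stall the whole worker.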

Hope this helped.

I will also attempt to answer the specific questions.

  • No, the threads option is not available for the async worker classes; it only applies to the threaded (gthread) worker. This really should be made clearer in the documentation, and I wonder why that has not happened.

  • This is a question that needs more knowledge of your specific application. If processing those hundreds of parallel requests mostly involves I/O-type operations, like fetching from the DB, saving, or collecting data from another service, then the threaded worker will serve you well. But if that is not the case and the tasks are extremely compute bound (maybe like calculating primes) and you want to use all n cores of your CPU, you need the sync worker. The reasoning for async is slightly different: to use async, you need to be sure your processing is not compute bound, which also means you will not be able to make use of multiple cores. The advantage is that you avoid the memory that multiple threads would consume, but you take on other issues, like non-monkey-patched libraries. Move to async only if the threaded worker does not meet your requirements. (See the sizing sketch after this list.)

  • Sync, non threaded workers are the best option if you want absolute thread safety amongst your libraries.
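As a rough sizing sketch for the Django API case in the second point (an I/O-bound API that must hold a few hundred requests in flight; every number here is an assumption to show the arithmetic, not a recommendation):

```python
# gunicorn.conf.py sketch: threaded workers sized for a few hundred concurrent, I/O-bound requests
import multiprocessing

cores = multiprocessing.cpu_count()   # say 4
workers = cores * 2 + 1               # 9 processes
threads = 12                          # 9 * 12 = 108 requests can be in flight at once
worker_class = "gthread"
```

If measurement shows the work really is all I/O and you need far more concurrency than that, the gevent worker with a large worker_connections is the next step; if it turns out to be compute bound, only more sync workers (and more cores) will help.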

  • But due to the GIL, why not always run async, even if compute bound? Thread safety is already guaranteed. – garg10may Nov 16 '17 at 13:15
  • Placing myself in the shoes of someone who might decide to do this, I would be scared of which modules might not be monkey patched correctly in order to work predictably when the async worker class is used. Async, for all its benefits, does come with its own risks. You must make absolutely sure that all your code is monkey patched and no native code runs. That risk would prevent me from going async always. – abhayAndPoorvisDad Nov 20 '17 at 08:55
  • Nice answer. It's still unclear to me why/how libraries should be monkey patched, and whether there is some way to check if they are? – Paolo Feb 18 '18 at 02:05
  • Do threaded workers pose a significant risk of thread safety bugs compared to non-threaded workers? Do libraries need to be monkey-patched for sync threaded workers? – Antony Mativos Jun 28 '19 at 11:52
  • **Increase Number of workers and each worker is multithreaded**: when I did that, I found that the total number of threads I specify is getting shared across all workers. In other words, each worker is not creating the number of threads I provide. Is this the expected behaviour? – Shiv Krishna Jaiswal Aug 13 '21 at 13:12
  • Btw, if you're running code that uses modules like Numpy, or you write your own C/C++ code for use from Python, you can control the GIL and therefore achieve better concurrency with threads. – SonarJetLens Aug 26 '21 at 08:42
  • I would like to understand this statement "Even though I have 4 threads, only one thread can actually process the results at a time. " Why is that? – vbfh Oct 19 '21 at 00:17
  • What that means is that each thread, upon executing, takes the interpreter lock, and thus even though you might have 4 cores, Python forces the concurrency to 1. This is the Python interpreter's GIL: https://realpython.com/python-gil/ – abhayAndPoorvisDad Oct 20 '21 at 06:35
  • I am using fastapi with async functions, so I am using uvicorn workers with gunicorn, but during load testing more than 46% of the requests fail and CPU usage never goes past 30% on any core of my 8T/4C CPU. How do I fix this? More info: https://stackoverflow.com/questions/70912912/gunicorn-doesnt-use-all-cpu-resulting-in-lot-of-failed-requests – Naveen Reddy Marthala Jan 30 '22 at 17:17
  • The original response implies you can use a default synchronous gunicorn worker with multiple threads. I am confused by this. According to gunicorn docs, the `--threads` setting only impacts the `gthread` worker type, which they say uses an event loop, so it's an asynchronous worker. As far as I can tell, you can't have "multi-threaded synchronous" workers in gunicorn. Is this correct? – Ely May 22 '23 at 17:08