
I would like to cache a large amount of data in a Flask application. Currently it runs on K8S pods with the following gunicorn.ini:

bind = "0.0.0.0:5000"
workers = 10
timeout = 900
preload_app = True

To avoid caching the same data in those 10 workers, I would like to know if Python supports a way to multi-thread instead of multi-process. This would be very easy in Java, but I am not sure if it is possible in Python. I know that you can share a cache between Python instances using the file system or other methods. However, it would be a lot simpler if it is all shared in the same process space.

Edited: There are a couple of posts suggesting that threads are supported in Python: this comment by Filipe Correia, or this answer in the same question.

Based on the above comment, the Gunicorn design document talks about workers and threads:

Since Gunicorn 19, a threads option can be used to process requests in multiple threads. Using threads assumes use of the gthread worker.

Based on how Java works, to share some data among threads, I would need one worker and multiple threads. Based on this other link, I know it is possible. So I assume I can change my gunicorn configuration as follows:

bind = "0.0.0.0:5000"
workers = 1
threads = 10
timeout = 900
preload_app = True

This should give me 1 worker and 10 threads, which should be able to process the same number of requests as the current configuration. However, the question is: would the cache still be instantiated once and shared among all the threads? How or where should I instantiate the cache to make sure it is shared among all the threads?
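For reference, a sketch of the kind of cache I mean (module-level, so it is created once per process; the names are mine):

```python
import threading

# Module-level cache: created once when the module is imported.
# With workers = 1, every request thread runs in the same process
# and sees this same dict.
_cache = {}
_lock = threading.Lock()

def get_or_compute(key, compute):
    """Return the cached value for key, computing it at most once."""
    value = _cache.get(key)       # fast path: no lock for a hit
    if value is None:
        with _lock:               # slow path: compute once under the lock
            value = _cache.get(key)
            if value is None:
                value = compute()
                _cache[key] = value
    return value
```

With preload_app = True and workers = 1, I assume this dict would exist once and be visible to all 10 threads.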

Fabio

1 Answer


would like to ... multi-thread instead of multi-process.

I'm not sure you really want that. Python is rather different from Java.

workers = 10

One way to read that is "ten cores", sure. But another way is "wow, we get ten GILs!" The global interpreter lock must be held before the interpreter interprets a new bytecode instruction.

Ten interpreters offer significant parallelism, executing ten instructions simultaneously. Now, there are workloads dominated by async I/O, or where the interpreter calls into a C extension to do the bulk of the work. If a C thread can keep running, doing useful work in the background, and the interpreter gathers the result later, terrific. But that's not most workloads.
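A quick illustration of the effect (timings will vary by machine): on CPython, two CPU-bound threads tend to take about as long as running the same work sequentially, because each thread must hold the GIL to execute bytecode.

```python
import threading
import time

def count_down(n):
    # Pure-Python CPU-bound loop: holds the GIL the whole time it runs.
    while n > 0:
        n -= 1

N = 5_000_000

start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

start = time.perf_counter()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.perf_counter() - start

# Typically the two-thread run is no faster than sequential,
# since only one thread can execute bytecode at a time.
print(f"sequential: {sequential:.2f}s, two threads: {threaded:.2f}s")
```

With ten worker processes you get ten GILs, and this bottleneck disappears for CPU-bound work.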

tl;dr: You probably want ten GILs, rather than just one.


To avoid caching the same data in those 10 workers

Right! That makes perfect sense.

Consider pushing the cache into a storage layer, or a daemon like Redis.

Or access memory-resident cache, in the context of your own process, via mmap or shmat.
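For instance, the standard library exposes named shared memory that several worker processes can attach to; a minimal sketch (a real cache would layer serialization and locking on top):

```python
from multiprocessing import shared_memory

# One worker creates a named shared-memory segment; size is illustrative.
shm = shared_memory.SharedMemory(create=True, size=16)
try:
    shm.buf[:5] = b"hello"       # one worker writes into the segment

    # Another process would attach by name instead of creating:
    other = shared_memory.SharedMemory(name=shm.name)
    data = bytes(other.buf[:5])  # another worker reads the same bytes
    other.close()
finally:
    shm.close()
    shm.unlink()                 # free the segment when done

print(data)
```

Unlike a per-process dict, the segment lives outside any one interpreter, so ten workers can share it without caching the data ten times.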


When running Flask under Gunicorn, you are certainly free to set threads greater than 1, though it's likely not what you want. YMMV. Measure and see.

J_H
  • The comment by Filipe Correia in this [answer](https://stackoverflow.com/a/13929101/1077748) seems to suggest that threads are supported by Python. – Fabio Oct 12 '22 at 12:58