I have a large read-only data structure (a graph loaded in networkx, though this shouldn't be important) that I use in my web service. The web service is built in Flask and served through Gunicorn. It turns out that every gunicorn worker I spin up holds its own copy of the data structure. Thus, my ~700 MB data structure, which is perfectly manageable with one worker, turns into a pretty big memory hog when I have 8 of them running. Is there any way I can share this data structure between gunicorn processes so I don't have to waste so much memory?
-
Have you considered using something like Redis to store the data and access it from each process? Would be very similar to shared memory as far as speed goes. – nathancahill Dec 02 '14 at 01:45
-
I would, but we're talking about a complex graph that there's no easy way to store in Redis (Redis has no directed edge graphs or general graph support currently AFAIK). – Eli Dec 02 '14 at 01:55
-
Did the solution work for you? If yes, can you let me know in detail how you did it? – neel Mar 11 '16 at 06:28
1 Answer
28
It looks like the easiest way to do this is to tell gunicorn to preload your application using the preload_app option. This assumes that you can load the data structure as a module-level variable:
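from flask import Flask
from your.application import CustomDataStructure

# Loaded once at import time, before gunicorn forks its workers (with preload_app).
CUSTOM_DATA_STRUCTURE = CustomDataStructure('/data/lives/here')
app = Flask(__name__)
# @app.route handlers, etc.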
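To illustrate why preloading helps, here is a minimal sketch of the underlying mechanism on a Unix system: an object built before `os.fork()` is readable in the child without being loaded again. The names `load_graph`, `GRAPH`, `LOAD_COUNT` and `worker_reads_graph` are hypothetical stand-ins, and the tiny dict stands in for the real graph; this is not gunicorn's own code.

```python
import os

LOAD_COUNT = 0  # how many times the (pretend) expensive load has run


def load_graph():
    """Hypothetical stand-in for an expensive load such as reading a big graph."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return {"a": ["b", "c"], "b": ["c"], "c": []}


# Loaded once, in the parent, at import time (what preloading does for the app).
GRAPH = load_graph()


def worker_reads_graph():
    """Fork a child, let it read GRAPH, and return what it reports via a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process: plays the role of a worker
        try:
            os.close(read_fd)
            # The child sees the parent's objects; load_graph() is not called again.
            os.write(write_fd, f"{LOAD_COUNT}:{len(GRAPH)}".encode())
        finally:
            os._exit(0)
    # Parent: close the write end so read() sees EOF once the child exits.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        report = reader.read()
    os.waitpid(pid, 0)
    return report


if __name__ == "__main__":
    print(worker_reads_graph())  # load count and node count as seen by the child
```

With gunicorn itself, the same effect is requested with the `--preload` command-line flag or by setting `preload_app = True` in a config file passed with `-c`. Note that the sharing is copy-on-write, so pages the workers write to (including, in CPython, reference-count updates) may still end up duplicated over time.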
Alternatively, you could use a memory-mapped file (if you can wrap the shared memory with your custom data structure), gevent with gunicorn to ensure that you're only using one process, or the multiprocessing module to spin up your own data-structure server, which you connect to using IPC.
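As a rough sketch of the memory-mapped-file option, the example below stores a graph as a flat array of 32-bit integers and wraps a read-only `mmap` in a small class. The file layout, `write_csr` and `MappedGraph` are assumptions made for illustration (nodes are assumed to be numbered 0..n-1); a networkx graph would need to be converted to such a layout first.

```python
import mmap
import os
import struct
import tempfile


def write_csr(path, adjacency):
    """Write nodes 0..n-1 as little-endian uint32s: n, n+1 offsets, then targets."""
    offsets = [0]
    targets = []
    for neighbors in adjacency:
        targets.extend(neighbors)
        offsets.append(len(targets))
    values = [len(adjacency)] + offsets + targets
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}I", *values))


class MappedGraph:
    """Read-only wrapper; the OS can share the mapped pages between processes."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        (self.node_count,) = struct.unpack_from("<I", self._map, 0)

    def _uint(self, index):
        # Read the uint32 stored at the given position in the flat array.
        return struct.unpack_from("<I", self._map, 4 * index)[0]

    def neighbors(self, node):
        start = self._uint(1 + node)   # offsets[node]
        end = self._uint(2 + node)     # offsets[node + 1]
        base = 2 + self.node_count     # position of the first target
        count = end - start
        return list(struct.unpack_from(f"<{count}I", self._map, 4 * (base + start)))

    def close(self):
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "graph.bin")
    write_csr(path, [[1, 2], [2], []])
    graph = MappedGraph(path)
    print(graph.neighbors(0))
    graph.close()
```

Each worker would open the same file with `MappedGraph(path)`; because the mapping is read-only and file-backed, the workers do not each need a private copy of the data.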


Sean Vieira
-
The preload option is not working. Can you provide an example of how to use it with some dummy data structure? – neel Mar 10 '16 at 07:02
-
@neel - you're probably better off asking another question with an example of your setup and what's not working. – Sean Vieira Mar 10 '16 at 15:39
-
I have posted the question here: http://stackoverflow.com/questions/35914587/how-to-get-a-concurreny-of-1000-requests-with-flask-and-gunicorn It would be great if you could take a look at it. Thanks in advance. – neel Mar 10 '16 at 15:44
-
A great read, although it didn't help me set up catching the parent process while using a Uvicorn worker. I did manage to stumble upon a solution that I think is even cleaner than the preload method: using a Python config file for gunicorn, `-c gconfig.py`. – aliqandil Dec 20 '20 at 06:20