I am developing a file-repository system in Python that I want to serve as an HTTP API endpoint.
A main class - the 'engine' - performs read and write operations to disk via the methods get_data() and put_data().
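For illustration, a stripped-down version of the engine could look like this (heavily simplified; the class body below is just a sketch, only the method names match my actual code):

```python
import os

class VEngine:
    """Toy file-repository engine: one file per key under a root directory."""

    def __init__(self, root="."):
        self.root = root
        print("Engine started!")  # shows up once per process in the logs

    def put_data(self, key, data: bytes):
        # Write the payload to a file on disk.
        with open(os.path.join(self.root, key), "wb") as f:
            f.write(data)

    def get_data(self, key) -> bytes:
        # Read the payload back from disk.
        with open(os.path.join(self.root, key), "rb") as f:
            return f.read()
```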
The engine object is instantiated in the main entrypoint file, which contains the Flask app and some simple routes:
from flask import Flask

repo_instance = VEngine()
app = Flask(__name__)

@app.route('/api/', methods=['GET'])
def get_data():
    ...
    repo_instance.get_data(params)
    ...

@app.route('/api/', methods=['POST'])
def put_data():
    ...
    repo_instance.put_data(params)
    ...
I want to serve the API using Flask with gunicorn using multiple workers:
gunicorn -w 4 entrypoint:app -b 0.0.0.0:8000
Currently, as expected, the engine is instantiated separately in each of the 4 worker processes:
[2021-10-07 07:32:14 +0000] [7] [INFO] Starting gunicorn 20.1.0
[2021-10-07 07:32:14 +0000] [7] [INFO] Listening at: http://0.0.0.0:8000 (7)
[2021-10-07 07:32:14 +0000] [7] [INFO] Using worker: sync
[2021-10-07 07:32:14 +0000] [8] [INFO] Booting worker with pid: 8
[2021-10-07 07:32:14 +0000] [24] [INFO] Booting worker with pid: 24
[2021-10-07 07:32:14 +0000] [40] [INFO] Booting worker with pid: 40
[2021-10-07 07:32:14 +0000] [56] [INFO] Booting worker with pid: 56
Engine started!
Engine started!
Engine started!
Engine started!
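As I understand it, this duplication follows from gunicorn's fork model: module-level objects created at import time are copied into every worker, and changes in one worker don't propagate to the others. A minimal illustration of the same mechanism with a bare os.fork() (Unix only; not gunicorn itself, just the underlying behavior):

```python
import os

# Module-level object, analogous to repo_instance in the entrypoint.
state = {"owner_pid": os.getpid()}

child = os.fork()
if child == 0:
    # Child process: it received a *copy* of the parent's memory.
    state["owner_pid"] = os.getpid()  # mutates the child's copy only
    os._exit(0)

os.waitpid(child, 0)
# The parent's copy is untouched by the child's mutation.
assert state["owner_pid"] == os.getpid()
```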
Given that gunicorn spawns multiple workers to handle requests, how can I ensure there is always exactly one instance of the engine class, making the app process-safe since a single process would then perform all I/O operations?