I have a Flask application that is supposed to display the result of a long-running function to the user on a specified route. The result changes roughly every hour. To spare the user from waiting for the result, I want to cache it somewhere in the application and re-compute it in the background at fixed intervals (e.g. every hour), so that no user request ever has to wait for the long-running computation.
The idea I came up with to solve this is as follows. However, I am not sure whether it is really "safe" in a production environment with a multi-threaded or even multi-process web server such as waitress, eventlet, gunicorn, or the like.
To re-compute the result in the background, I use a BackgroundScheduler from the APScheduler library.
The result is then prepended (via appendleft) to a collections.deque object that lives in a module-level variable (as far as I know, there is no better way to store application-wide globals in a Flask application). Since the maximum length of the deque is set to 2, old results drop off the right side of the deque as new ones come in.
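The eviction behaviour described here can be seen in isolation in a small sketch. The name `cache` and the string values are placeholders for illustration only, not part of the application:

```python
from collections import deque

# Hypothetical stand-in for the module-level cache; maxlen=2 as in the application.
cache = deque(maxlen=2)

cache.appendleft("result 1")
cache.appendleft("result 2")
cache.appendleft("result 3")  # "result 1" is discarded from the right end

newest = cache[0]        # newest result, read without removing it
contents = list(cache)   # newest first: "result 3", then "result 2"
```

At no point between the second and third `appendleft` is the deque empty, which is the property the caching idea relies on.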
A Flask view then returns the leftmost element of the deque (index 0) to the requester, which should always be the newest result. I chose deque over Queue since the latter has no built-in way to read the first item without removing it.
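As a minimal sketch of that difference (all variable names are made up for illustration): reading from a queue.Queue with get() consumes the item, whereas indexing a deque leaves it in place.

```python
from collections import deque
from queue import Queue

# Queue: get() is the public way to read an item, and it removes the item.
q = Queue()
q.put("result")
taken = q.get()
queue_empty_after_get = q.empty()

# deque: indexing reads the item and leaves the container unchanged.
d = deque(["result"], maxlen=2)
peeked = d[0]
deque_len_after_peek = len(d)
```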
Thus, once the first result has been computed, no user ever has to wait for the result, because the old one only disappears from the "cache" at the very moment the new one comes in.
See below for a minimal example of this. When running the script and hitting http://localhost:5000, one can see the caching in action: "Job finished at" should never be more than 10 seconds (plus a short moment for re-computing it) behind "Current time", and yet one should never have to wait the time.sleep(5) seconds from the job function until the request returns.
Is this a valid implementation for the given requirement that will also work in a production-ready WSGI server setting or should this be accomplished differently?
import datetime
import time
from collections import deque

from apscheduler.schedulers.background import BackgroundScheduler
from flask import Flask

# a global deque that is filled by APScheduler and read by a Flask view
# (named result_cache so that it does not shadow the imported deque class)
result_cache = deque(maxlen=2)


# a function filling the deque that is executed in regular intervals by APScheduler
def some_long_running_job():
    print('complicated long running job started...')
    time.sleep(5)
    job_finished_at = datetime.datetime.now()
    result_cache.appendleft(job_finished_at)


# a function setting up the scheduler
def start_scheduler():
    scheduler = BackgroundScheduler()
    scheduler.add_job(
        some_long_running_job,
        trigger='interval',
        seconds=10,
        # naive datetimes are interpreted in the scheduler's (local) timezone,
        # so use now() rather than utcnow() to trigger the first run immediately
        next_run_time=datetime.datetime.now(),
        id='1',
        name='Some Job name',
    )
    scheduler.start()


# a flask application
app = Flask(__name__)


# a flask route returning an item from the global deque
@app.route('/')
def display_job_result():
    current_time = datetime.datetime.now()
    # the deque is still empty until the first job run has finished
    job_finished_at = result_cache[0] if result_cache else 'not computed yet'
    return '''
    Current time is: {0} <br>
    Job finished at: {1}
    '''.format(current_time, job_finished_at)


# start the scheduler and flask server
if __name__ == '__main__':
    start_scheduler()
    app.run()