Would starting APScheduler in a uwsgi app end up with one scheduler for each worker?

Question

I have a flask application in which I need the scheduling feature of APScheduler. The question is:

Where do I start the scheduler instance?

I serve this application with uwsgi + nginx using multiple workers. Wouldn't I end up with multiple Scheduler instances that are oblivious of each other? If so, a single job would be triggered multiple times, wouldn't it?

What is the best strategy in this case so I end up with just one Scheduler instance and still be able to access the application's context from within the scheduled jobs?

This question has the same problem albeit with gunicorn instead of uwsgi, but the answer could be similar.

Below is the code defining "app" as a uwsgi callable application object. The file containing this code is called wsgi.py (not that it matters).

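import logging

import uwsgi
from apscheduler import events
from apscheduler.jobstores.shelve_store import ShelveJobStore
from apscheduler.scheduler import Scheduler

# create_app and ProductionConfig come from the application package (import not shown).
app = create_app(config=ProductionConfig())

def job_listener(event):
    # Log every executed, missed, or failed job.
    logging.info("msg from job '%s'", event.job)

# This code below never gets invoked when I check with worker_id() == 1
# The only time it is run is with worker_id() value of 0
app.sched = Scheduler()
app.sched.add_jobstore(ShelveJobStore('/tmp/apsched_%d' % uwsgi.worker_id()), 'file')
app.sched.add_listener(job_listener,
                       events.EVENT_JOB_EXECUTED |
                       events.EVENT_JOB_MISSED |
                       events.EVENT_JOB_ERROR)
app.sched.start()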

score 5 · Accepted Answer · answered Nov 14 '14 at 12:22

uWSGI has a feature called mules (see: http://uwsgi-docs.readthedocs.org/en/latest/Mules.html). You can use them to start a script under the master process that is not reachable through the socket. Mules are designed to offload work from the main app, such as running schedulers and handling signals, so they seem to be a perfect fit for deploying a scheduler inside the uWSGI stack.
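A minimal sketch of what such a mule script might look like, assuming APScheduler 2.x (the Scheduler class used in the question) and a Flask app factory. The file name scheduler_mule.py, the module path myapp, and the heartbeat job are hypothetical. The in_app_context helper is one possible way to give jobs access to the application context; it is not part of either library.

```python
import functools
import logging
import time

try:
    import uwsgi  # only importable inside a uWSGI process
except ImportError:
    uwsgi = None


def in_app_context(app, func):
    """Wrap func so that every run happens inside app.app_context()."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with app.app_context():
            return func(*args, **kwargs)
    return wrapper


def heartbeat():
    # Hypothetical job; a real one could use flask.current_app here.
    logging.info("scheduler heartbeat")


def main():
    # Local imports keep the module loadable without these packages.
    from apscheduler.scheduler import Scheduler  # APScheduler 2.x, as in the question
    from myapp import create_app, ProductionConfig  # hypothetical module path

    app = create_app(config=ProductionConfig())
    sched = Scheduler()
    sched.add_interval_job(in_app_context(app, heartbeat), minutes=5)
    sched.start()  # runs in a background thread by default
    while True:  # keep the mule process alive
        time.sleep(60)


# Run only inside a mule (assumption: mule_id() is 0 in every other process).
if uwsgi is not None and uwsgi.mule_id() > 0:
    main()
```

The script would be attached with something like --mule=scheduler_mule.py. Because the scheduler runs in a background thread, --enable-threads is probably needed as well. Since only one mule runs the script, there is only one scheduler regardless of how many workers serve requests.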

score 1 · Answer 2 · answered Jun 23 '14 at 15:45

uWSGI provides the function uwsgi.worker_id(). If you start the scheduler conditionally, in one specific worker only, you won't end up with multiple scheduler instances.
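A minimal sketch of that conditional start, under two assumptions: worker IDs start at 1 (0 meaning the process is not a worker), and uWSGI loads the app before forking unless told otherwise, so module-level code may only ever see worker_id() == 0. The postfork decorator from uwsgidecorators runs a function in each process after the fork, where the ID is known. The names SCHEDULER_WORKER_ID, should_start_scheduler, and start_scheduler are hypothetical.

```python
SCHEDULER_WORKER_ID = 1  # assumption: worker IDs start at 1; 0 is not a worker

try:
    import uwsgi  # only importable inside a uWSGI process
    from uwsgidecorators import postfork
except ImportError:
    uwsgi = None

    def postfork(func):
        # Fallback so the module still imports outside uWSGI.
        return func


def should_start_scheduler(worker_id, target=SCHEDULER_WORKER_ID):
    """Return True only in the single worker chosen to own the scheduler."""
    return worker_id == target


def start_scheduler():
    # APScheduler 2.x API, as used in the question.
    from apscheduler.scheduler import Scheduler
    sched = Scheduler()
    sched.start()
    return sched


@postfork
def start_scheduler_in_one_worker():
    """Runs after fork; starts the scheduler in one worker only."""
    if uwsgi is not None and should_start_scheduler(uwsgi.worker_id()):
        start_scheduler()
        return True
    return False
```

Jobs added from other workers would not reach this scheduler's in-memory store, so this approach fits best when the jobs are all registered in the same worker that runs the scheduler.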

Alex Grönholm


I assume the worker_id is assigned as they are created. How can I know beforehand which id I need to qualify with to start the scheduler? – Will Jun 23 '14 at 18:48
I don't know how they are assigned. If they are assigned sequentially, you could just use worker id 0 or 1 (whichever is the first). – Alex Grönholm Jun 24 '14 at 19:27
It seems like they're indeed assigned sequentially, starting from 1. – Alex Grönholm Jun 24 '14 at 19:33
The real question is, if worker 1 is killed, will it respawn with the same ID? – Alex Grönholm Jun 24 '14 at 19:33
My research says that yes, they will reuse the IDs. – Alex Grönholm Jun 24 '14 at 19:38
Alex, thanks for your research. I will try this, if it works, I will accept your answer. – Will Jun 25 '14 at 13:24
It turns out that worker_id() returns 0 for the first thread, and the Scheduler is instantiated. According to the uwsgi application log, 2 workers were created, and somehow the scheduler is not created again when they start. But when one of these workers schedules a job, it never gets executed when the time comes. – Will Jun 25 '14 at 22:30
Worker ID 0 is only returned for the "master" process. You want to add the jobs and start the scheduler in worker #1. If that doesn't work, look at the (debug) log and see what the scheduler is doing. – Alex Grönholm Jun 26 '14 at 16:08
There shouldn't be anything special about whether I start the scheduler in worker #1 vs #0. I turned on debugging, and the log only shows output when the uwsgi server is restarted. It almost seems like the Scheduler thread is halted after running for just a few seconds. – Will Jun 26 '14 at 16:34
IIRC uwsgi disables Python threads by default. Have you tried with --enable-threads? – Alex Grönholm Jun 27 '14 at 18:07
Yes, --enable-threads is in the config, and the log confirms that Python threads are enabled. This is really puzzling. – Will Jun 28 '14 at 16:48
The UWSGI docs say that if worker_id() returns 0, it means the current process is not a worker at all. That's why you should check against worker id == 1. Did you try that? If it didn't work, could you add the code you used to initialize the webapp and the scheduler? – Alex Grönholm Jun 29 '14 at 18:15
I did check for worker_id() == 1. But the code never gets called with that value. I added the code in the post above. – Will Jun 30 '14 at 03:08
