There are already a number of questions on creating a daemon in Python, like this one, which answer that part nicely.
So, how do you have your daemon do background work?
As you suspected, threads are an obvious answer. But there are three possible complexities.
First, there's shutdown. If you're lucky, your crunchData function can be summarily killed at any time with no corrupted data or (too-significant) lost work. In that case:

    import threading

    def worker():
        # Loop forever; the thread dies when the main process exits.
        while True:
            crunchData()

    # ... somewhere in the daemon startup code ...
    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()

Notice the t.daemon = True line. A "daemon thread" has nothing to do with your program being a daemon; it means the interpreter won't wait for the thread at exit, so you can just quit the main process and the thread will be summarily killed.
But what if crunchData can't be killed? Then you'll need to do something like this:
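
    import threading

    quitflag = False
    quitlock = threading.Lock()

    def worker():
        while True:
            # Check the flag between iterations, never in the middle of one.
            with quitlock:
                if quitflag:
                    return
            crunchData()

    # ... somewhere in the daemon startup code ...
    t = threading.Thread(target=worker)
    t.start()

    # ... somewhere in the daemon shutdown code ...
    # (if this runs inside a function, it needs a "global quitflag" declaration)
    with quitlock:
        quitflag = True
    t.join()  # wait for the current crunchData call to finish
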
I'm assuming each iteration of crunchData doesn't take that long. If it does, you may need to check quitflag periodically within the function itself.
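
Here's a rough sketch of what checking inside the function could look like. It swaps the flag-plus-lock pair for a threading.Event, which is a different (but standard) way to signal shutdown. The names quit_event and crunch_data, the chunk count, and the time.sleep standing in for real work are all made up for illustration:

```python
import threading
import time

# Hypothetical name; an Event replaces the quitflag + quitlock pair.
quit_event = threading.Event()


def crunch_data(chunks=100):
    """Stand-in for a long crunchData that checks for shutdown between chunks."""
    for _ in range(chunks):
        if quit_event.is_set():
            return False      # stop early, at a safe point between chunks
        time.sleep(0.01)      # placeholder for one chunk of real work
    return True


def worker():
    while not quit_event.is_set():
        crunch_data()


# ... somewhere in the daemon startup code ...
t = threading.Thread(target=worker)
t.start()

# ... somewhere in the daemon shutdown code ...
time.sleep(0.05)              # simulate the daemon running for a moment
quit_event.set()
t.join(timeout=5)
```

The important part is that the check happens only at points where stopping leaves the data consistent, so how often you can check depends on how your real work is structured.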
Meanwhile, you want your request handler to access some data that the background thread is producing. You'll need some kind of synchronization there as well.
The obvious thing is to just use another Lock. But there's a good chance that crunchData is writing to its data frequently. If it holds the lock for 10 seconds at a time, the request handler may block for 10 seconds. But if it grabs and releases the lock a million times, the locking itself could take longer than the actual work.
One alternative is to double-buffer your data: Have crunchData write into a new copy, then, when it's done, briefly grab the lock and set currentData = newData.
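
A minimal sketch of that double-buffering idea, assuming the data is a dict and that a published copy is never mutated afterwards. The helpers publish and handle_request, and the toy crunch_data, are hypothetical names for illustration:

```python
import threading

data_lock = threading.Lock()
current_data = {}   # the snapshot the request handler reads


def crunch_data(n):
    """Hypothetical stand-in: build a complete new result without the lock."""
    new_data = {i: i * i for i in range(n)}   # the slow part, done lock-free
    return new_data


def publish(new_data):
    """Swap in the finished copy; the lock is held only for the assignment."""
    global current_data
    with data_lock:
        current_data = new_data


def handle_request(key):
    """Grab a reference under the lock, then read from it without the lock."""
    with data_lock:
        snapshot = current_data
    return snapshot.get(key)


publish(crunch_data(10))
print(handle_request(3))
```

This only works if the background thread builds a fresh object each time. If it modifies the published dict in place, the handler is back to reading half-written data.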
Depending on your use case, a Queue, a file, or something else might be even simpler.
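
For the Queue option, something along these lines might do, assuming the background work produces discrete results that the handler can pick up later. The worker and drain helpers and the toy batches are invented for the example:

```python
import queue
import threading

results = queue.Queue()   # thread-safe; no explicit Lock needed


def worker(batches):
    """Background thread: push each finished result onto the queue."""
    for batch in batches:
        results.put(sum(batch))   # placeholder for real crunchData output


def drain():
    """Request-handler side: collect whatever is ready without blocking."""
    ready = []
    while True:
        try:
            ready.append(results.get_nowait())
        except queue.Empty:
            return ready


t = threading.Thread(target=worker, args=([[1, 2], [3, 4]],))
t.start()
t.join()          # joined here only so the example is deterministic
latest = drain()
```

In a real daemon you wouldn't join before draining; the handler would just call drain() whenever a request comes in and take what's there.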
Finally, crunchData is presumably doing a lot of CPU work. You need to make sure that the request handler does very little CPU work, or each request will slow things down quite a bit as the two threads fight over the GIL. Usually this is no problem. If it is, use a multiprocessing.Process instead of a Thread (which makes sharing or passing the data between the two processes slightly more complicated, but still not too bad).
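
As a sketch of the multiprocessing variant, assuming the result is small enough to pickle and send back through a multiprocessing.Queue. The function names and the toy crunch_data are hypothetical:

```python
import multiprocessing


def crunch_data(n):
    """Hypothetical CPU-bound stand-in for crunchData."""
    return sum(i * i for i in range(n))


def worker(n, results):
    """Runs in the child process; sends the result back through the queue."""
    results.put(crunch_data(n))


def crunch_in_process(n):
    """Run crunch_data in a separate process and wait for its result."""
    results = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(n, results))
    p.start()
    value = results.get()   # read before join() so the child can flush its queue
    p.join()
    return value
```

Call crunch_in_process(1000) from under an if __name__ == "__main__": guard, since some platforms start the child by re-importing your main module. The child gets its own interpreter and its own GIL, so the request handler no longer competes with it for CPU time.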