I'm not interested in stopping a program while it's running in Python; of course, Ctrl+C can do that. What I'm interested in is the following situation: suppose you have a program that runs for 5 hours. You let it run for two hours but then decide that what you've done so far is worth saving, but you still do not want to continue. What, then, is the best way to save your data and exit the program?

Until now, what I do is store a boolean in a pickle, then open the pickle on each loop iteration and check its value. If the boolean is true, the program keeps running; if false, the program stops, saves the data, and exits. I can change the value of the boolean using a different program. However, even if the pickle contains just a single boolean, it still seriously slows the program down, maybe as much as 10 times, since pickles take so long to open.

I've thought about other solutions, and I'm aware of the `pdb.set_trace()` tool, but I really don't see how it can be used in this case. I'm thinking maybe setting an environment variable might help, but I'm not very good with setting environment variables. Any suggestions would be appreciated.
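For concreteness, a minimal sketch of the pickle-flag check described above (file names are made up for illustration); the per-iteration open and unpickle is where the slowdown comes from:

import pickle

def run():
    results = []
    for task in range(10**6):
        # A second program rewrites keep_running.pickle with True/False.
        # Opening and unpickling it on every iteration is what makes
        # this approach slow.
        with open('keep_running.pickle', 'rb') as f:
            if not pickle.load(f):
                break
        results.append(task * task)  # stand-in for the real work
    with open('results.pickle', 'wb') as f:
        pickle.dump(results, f)      # save whatever was finished so far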
-
Why do you use a pickle? Isn't checking the _existence_ of a file (say, `/tmp/stop_my_program`) sufficient? – Selcuk Oct 02 '20 at 02:10
-
This depends a lot on the context of what you are doing. What sort of task is taking this long to run? – Kieran Wood Oct 02 '20 at 02:11
-
@Selcuk, cool, I think that might do the trick. – bobsmith76 Oct 02 '20 at 02:11
-
@KieranWood plenty of tasks take a very long time to run, just any of them. – bobsmith76 Oct 02 '20 at 02:12
-
@bobsmith76 You could do the check not in the innermost loop but in the third loop from the inside, so that it runs only once every 5-10 seconds on average. Then it won't slow down your program. – Arty Oct 02 '20 at 02:12
-
Even better, let your program create a file (e.g. `/tmp/foo_is_running`) when it starts, and save state and terminate if it gets deleted. This way you won't leave any artifacts after the program has exited. There will be race conditions but I guess that is not critical for your purposes. – Selcuk Oct 02 '20 at 02:12
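A minimal sketch of that sentinel-file idea, combined with the every-Nth-iteration throttle suggested above (the paths and the `save` helper are illustrative):

import json, os

SENTINEL = '/tmp/foo_is_running'  # delete this file to request a stop

def save(results, fname = 'partial_results.json'):
    with open(fname, 'w') as f:
        json.dump(results, f)

def run():
    open(SENTINEL, 'w').close()  # create the sentinel at startup
    results = []
    try:
        for task in range(10**6):
            # A bare existence check is far cheaper than unpickling,
            # and throttling it makes the cost negligible.
            if task % 1000 == 0 and not os.path.exists(SENTINEL):
                break  # sentinel was deleted externally: stop early
            results.append(task * task)  # stand-in for the real work
    finally:
        save(results)  # runs on early stop, normal exit, and errors
        if os.path.exists(SENTINEL):
            os.remove(SENTINEL)  # leave no artifacts behind

if __name__ == '__main__':
    run()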
-
@bobsmith76 To be sure that you don't check too often, you can do the checks in a separate thread in a loop, with `time.sleep(5)` (sleep 5 seconds). – Arty Oct 02 '20 at 02:15
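That variant might look something like this (the file path is illustrative): a daemon thread polls every 5 seconds, and the main loop only reads a plain boolean:

import os, threading, time

stop_requested = False

def watch(path = '/tmp/stop_my_program'):
    # Polls for the stop file once every 5 seconds.
    global stop_requested
    while not stop_requested:
        if os.path.exists(path):  # create this file to request a stop
            stop_requested = True
        time.sleep(5)

threading.Thread(target = watch, daemon = True).start()

results = []
for task in range(10**6):
    if stop_requested:  # plain bool read: no file I/O in the hot loop
        break
    results.append(task * task)  # stand-in for the real work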
-
@Arty, yea, sometimes I do that. – bobsmith76 Oct 02 '20 at 02:15
-
Send a `SIGTERM` to the process and [handle](https://stackoverflow.com/questions/18499497/how-to-process-sigterm-signal-gracefully) it. – Klaus D. Oct 02 '20 at 02:22
-
how do you do that? – bobsmith76 Oct 02 '20 at 02:23
-
@bobsmith76 I've decided to implement a fairly complex but universal solution to your interesting task, with the possibility of providing arbitrary commands in a separate file, like save/exit, [in this answer](https://stackoverflow.com/a/64166490/941531) – Arty Oct 02 '20 at 05:24
-
@bobsmith76 You can also implement graceful process termination by handling SIGINT, as in [this solution](https://stackoverflow.com/a/31464349/941531). SIGINT can be sent to the program using [this program windows-kill](https://github.com/alirdn/windows-kill/releases), with the syntax `windows-kill -SIGINT PID`, where `PID` can be obtained by [Microsoft's pslist](https://download.sysinternals.com/files/PSTools.zip). – Arty Oct 02 '20 at 05:39
2 Answers
Other answers include checking the environment for things like variables and files. Those would all work, but could you do:
try:
    main()
except KeyboardInterrupt:
    save()
Or, if the saving process is the same one you'd use after `main` completes, a much more robust strategy would be:
try:
    main()
finally:
    save()
Here, `save()` will run for any error, `KeyboardInterrupt` or otherwise. It will also run if `main()` is successful.
If you’re trying to close it with a separate program, you can send a signal.
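A minimal sketch of that, assuming a POSIX system where another process can run `kill -TERM <pid>` (the handler and the placeholder bodies are illustrative, reusing the `main()`/`save()` names from above):

import signal

stop_requested = False

def request_stop(signum, frame):
    # Invoked when another process sends SIGTERM (e.g. `kill -TERM <pid>`).
    global stop_requested
    stop_requested = True

signal.signal(signal.SIGTERM, request_stop)

def main():
    for task in range(10**6):
        if stop_requested:
            break  # stop at the next safe point, not mid-task
        pass       # real work goes here

def save():
    pass           # placeholder: persist the partial results

try:
    main()
finally:
    save()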

-
@bobsmith76 I think that `except KeyboardInterrupt:` should be replaced with `finally:` so that work is always saved, at least to some temporary file: not only on keyboard interrupt, but also on a clean exit and when any error occurs. IMHO. – Arty Oct 02 '20 at 11:27
For your interesting task, just for fun, I decided to implement a fairly complex but universal solution that processes arbitrary commands asynchronously. Commands are provided in the `cmds.txt` file, one command per line. Right now just two commands, `save` and `exit`, are supported. `save` may take an optional second parameter after a space: the filename to save to (defaults to `save.txt`).

If the program exits abnormally (without an `exit` command being provided) then the work is saved to a temporary file, `save.txt.tmp`.
The `cmds.txt` file is processed in a separate thread. The file is checked every second; the checks are very fast and use almost no CPU, since they only test whether the file's modification time has changed. Each new command should be appended to the end of the file on a new line; processed lines should not be deleted. On program start the commands file is cleared.
The main thread just checks the `has_cmds` bool variable (whether there are new commands); this is very fast and can be done very often, e.g. after processing a tiny task 10-20 ms long. There are no mutexes, hence everything runs very fast.
As an example of usage, the main thread produces task results at random points in time and stores them in an array. On a `save` command this results array is saved as JSON.

The program prints everything it does to the console, with timestamps included.
To test the program, do the following:
- Start the program. It starts processing computation work immediately.
- Open `cmds.txt` in any text editor.
- Add a new line with the `save` string. Save the file.
- The program should print that the `save` command was recognized and processed, and that the work was saved to the file `save.txt`.
- Add another line in the editor: `save other.txt`. Save the file.
- The program should print that it has saved the work to `other.txt`.
- Add a new line `exit` and save.
- The program should exit.
- Try running the program again.
- Try pressing `Ctrl+C` in the program console.
- The program should catch this keyboard interrupt and say so; the work is saved to the temporary file `save.txt.tmp` and the program exits.
Of course, in the simplest case, saving work only on keyboard interrupt can be done like in this answer.
You can also implement graceful process termination by handling SIGINT, as in this solution. SIGINT can be sent to the program using the program windows-kill, with the syntax `windows-kill -SIGINT PID`, where `PID` can be obtained by Microsoft's pslist.
import threading, random, os, json, time, traceback

cmds = []          # queue of pending commands, shared between threads
has_cmds = False   # fast flag the main thread polls
cmds_fname = 'cmds.txt'
save_fname = 'save.txt'
save_fname_tmp = 'save.txt.tmp'

def CurTimeStr(*, exact = False):
    from datetime import datetime
    return (datetime.now(), datetime.utcnow())[exact].strftime(('[%H:%M:%S]', '[%Y-%m-%d %H:%M:%S.%f UTC]')[exact])

def Print(*pargs, **nargs):
    print(CurTimeStr(), *pargs, flush = True, **nargs)

def AddCmd(c, *, processed = False):
    # Queues a command; the 'processed' event is set once the main thread is done with it.
    global cmds, has_cmds
    cmds.append({**{'processed': threading.Event()}, **c})
    if processed:
        cmds[-1]['processed'].set()
    has_cmds = True
    return cmds[-1]

def ExternalCommandsThread():
    # Watches cmds.txt once per second and parses any newly appended lines.
    global cmds, has_cmds
    Print('Cmds thread started.')
    first, next_line, mtime = True, 0, 0.
    while True:
        try:
            if first:
                Print(f'Cleaning cmds file "{cmds_fname}".')
                with open(cmds_fname, 'wb') as f:
                    pass
                first = False
            if os.path.exists(cmds_fname) and abs(os.path.getmtime(cmds_fname) - mtime) > 0.0001 and os.path.getsize(cmds_fname) > 0:
                Print(f'Updated cmds file "{cmds_fname}". Processing lines starting from {next_line + 1}.')
                with open(cmds_fname, 'r', encoding = 'utf-8-sig') as f:
                    data = f.read()
                lines = list(data.splitlines())
                try:
                    mtime = os.path.getmtime(cmds_fname)
                    for iline, line in zip(range(next_line, len(lines)), lines[next_line:]):
                        line = line.strip()
                        if not line:
                            continue
                        # Lines may be plain words ('save other.txt') or JSON.
                        if line[0] not in ['[', '{', '"']:
                            cmd = line.split()
                        else:
                            cmd = json.loads(line)
                        pargs = []
                        if type(cmd) is list:
                            cmd, *pargs = cmd
                            cmd = {'cmd': cmd, 'pargs': pargs}
                        assert 'cmd' in cmd, 'No "cmd" in command line!'
                        c = cmd['cmd']
                        if c in ['save']:
                            assert len(set(cmd.keys()) - {'cmd', 'fname', 'pargs'}) == 0
                            AddCmd({'cmd': 'save', 'fname': cmd.get('fname', (cmd['pargs'] or [save_fname])[0])})
                        elif c == 'exit':
                            AddCmd({'cmd': 'exit'})
                        else:
                            assert False, f'Unrecognized cmd "{c}"!'
                        Print(f'Parsed cmd "{c}" on line {iline + 1}.')
                        next_line = iline + 1
                except (json.decoder.JSONDecodeError, AssertionError) as ex:
                    traceback.print_exc()
                    Print(f'Failed to parse cmds line {iline + 1} with text "{line}"!')
                except:
                    raise
            # Wait until the main thread has processed the queued commands.
            for i, c in enumerate(cmds):
                if c is None:
                    continue
                if not c['processed'].is_set():
                    has_cmds = True
                    while not c['processed'].wait(10):
                        Print(f'Timed out waiting for cmd "{c["cmd"]}" to be processed, continuing waiting!')
                Print(f'Processed cmd "{c["cmd"]}".')
                cmds[i] = None
                if c['cmd'] == 'exit':
                    Print('Exit cmd. Cmds thread finishes.')
                    return
            has_cmds = False
            time.sleep(1)
        except Exception as ex:
            traceback.print_exc()
            Print(f'Exception ^^^^^ in Cmds thread!')
            AddCmd({'cmd': 'exit'})
            time.sleep(3)

def Main():
    global cmds, has_cmds
    Print('Main thread started.')
    threading.Thread(target = ExternalCommandsThread, daemon = False).start()
    results = []
    def SaveWork(fname):
        with open(fname, 'w', encoding = 'utf-8') as f:
            f.write(json.dumps(results, ensure_ascii = False, indent = 4))
        Print(f'Work saved to "{fname}".')
    def ProcessCmds():
        # Returns False only if program should exit
        for c in cmds:
            if c is None or c['processed'].is_set():
                continue
            if c['cmd'] == 'save':
                SaveWork(c['fname'])
            elif c['cmd'] == 'exit':
                Print('Exit cmd. Main thread finishes...')
                c['processed'].set()
                return False
            else:
                assert False, f'Unknown cmd "{c["cmd"]}"!'
            c['processed'].set()
        return True
    try:
        # Main loop of tasks processing
        for i in range(1000):
            for j in range(10):
                if has_cmds and not ProcessCmds(): # Very fast check if there are any commands
                    return # Exit
                # Emulate small work of 0-200 ms long.
                time.sleep(random.random() * 0.2)
                # Store results of work in array
                results.append({'time': CurTimeStr(exact = True), 'i': i, 'j': j})
        assert False, 'Main finished without exit cmd!'
    except BaseException as ex:
        traceback.print_exc()
        Print(f'Exception ^^^^^ in Main thread!')
        SaveWork(save_fname_tmp)
        AddCmd({'cmd': 'exit'}, processed = True)

if __name__ == '__main__':
    Main()
Example output 1:
[08:15:16] Main thread started.
[08:15:16] Cmds thread started.
[08:15:16] Cleaning cmds file "cmds.txt".
[08:15:21] Updated cmds file "cmds.txt". Processing lines starting from 1.
[08:15:21] Parsed cmd "save" on line 1.
[08:15:21] Work saved to "save.txt".
[08:15:21] Processed cmd "save".
[08:15:31] Updated cmds file "cmds.txt". Processing lines starting from 2.
[08:15:31] Parsed cmd "save" on line 2.
[08:15:31] Work saved to "other.txt".
[08:15:31] Processed cmd "save".
[08:15:35] Updated cmds file "cmds.txt". Processing lines starting from 3.
[08:15:35] Parsed cmd "exit" on line 3.
[08:15:35] Exit cmd. Main thread finishes...
[08:15:35] Processed cmd "exit".
[08:15:35] Exit cmd. Cmds thread finishes.
The commands file `cmds.txt` corresponding to the output above:
save
save other.txt
exit
Example output 2:
[08:14:39] Main thread started.
[08:14:39] Cmds thread started.
[08:14:39] Cleaning cmds file "cmds.txt".
Traceback (most recent call last):
File "stackoverflow_64165394_processing_commands_in_prog.py", line 127, in Main
time.sleep(random.random() * 0.2)
KeyboardInterrupt
[08:14:40] Exception ^^^^^ in Main thread!
[08:14:40] Work saved to "save.txt.tmp".
[08:14:41] Processed cmd "exit".
[08:14:41] Exit cmd. Cmds thread finishes.
A piece of an example `save.txt`:
[
    {
        "time": "[2020-10-02 05:15:16.836030 UTC]",
        "i": 0,
        "j": 0
    },
    {
        "time": "[2020-10-02 05:15:16.917989 UTC]",
        "i": 0,
        "j": 1
    },
    {
        "time": "[2020-10-02 05:15:17.011129 UTC]",
        "i": 0,
        "j": 2
    },
    {
        "time": "[2020-10-02 05:15:17.156579 UTC]",
        "i": 0,
        "j": 3
    },
................

-
Thanks. Doesn't every programmer who works with long-running programs have to do something similar to this? I couldn't find anything by googling. Why? Anyway, it's going to take a while for me to read this. I gave up on the threading module when I learned that it does not increase speed, and I could find no use for it. Also, I have never used the traceback module. But again, thanks for your effort. – bobsmith76 Oct 03 '20 at 00:54
-
@bobsmith76 In Python, multiprocessing can almost always be used instead of threading; e.g. in my code, instead of `threading.Thread(...).start()` you can, almost without changes, do [multiprocessing.Process(...).start()](https://docs.python.org/3.8/library/multiprocessing.html#process-and-exceptions). Threading is just simpler to use in general. And yes, you're right: in general multiprocessing is faster in Python, because all threads run on just one CPU core, while processes each use a different one. – Arty Oct 03 '20 at 04:24
-
@bobsmith76 But because threads are easier to use, they can still be used in many situations: 1) when doing some very lightweight work, like I do in my thread, just once in a while; 2) threads share global/nonlocal variables, which can be needed, while processes can share data only by serializing it and sending it through a [Manager](https://docs.python.org/3.8/library/multiprocessing.html#managers); 3) threads may be just as efficient as processes even for heavy work when much of it happens in non-Python code, e.g. some C++ library, or in disk input/output. – Arty Oct 03 '20 at 04:28
-
@bobsmith76 In the case of using C++ code described above: when a C++ function is invoked from Python, it can be multithreaded across all cores and be efficient; also the [GIL](https://wiki.python.org/moin/GlobalInterpreterLock) is released inside C++ functions, hence there is no slowdown in them. – Arty Oct 03 '20 at 04:31
-
@bobsmith76 But when you have some heavy computation, processes should definitely be used instead of threads. There are other answers of mine, [like this one](https://stackoverflow.com/a/64080001/941531), about how and when to use multiprocessing correctly. Something like what I implemented above in my answer has probably already been done very well in some open-source libraries, but they're not very popular, hence the same code gets reimplemented many times by programmers like me. – Arty Oct 03 '20 at 04:40
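A hedged sketch of the process-based variant these comments describe, using `multiprocessing.Event` and `Queue` (rather than a `Manager`) to share the stop flag and return partial results; all names are illustrative:

import multiprocessing, time

def worker(stop_event, queue):
    # Runs in a separate process, so it gets its own CPU core and GIL.
    results = []
    for task in range(1000):
        if stop_event.is_set():  # stop was requested from outside
            break
        results.append(task * task)
        time.sleep(0.01)         # stand-in for real work
    queue.put(results)           # results come back by serialization

if __name__ == '__main__':
    stop_event = multiprocessing.Event()
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target = worker, args = (stop_event, queue))
    p.start()
    time.sleep(1)                # let it work for a while
    stop_event.set()             # request an early stop
    print(len(queue.get()), 'partial results received')
    p.join()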
-
I use forking but I'm pretty sure forking is the same as multiprocessing. I've also noticed, I think, that 2 forks get the job done just as fast as 4. I've always wondered about this. – bobsmith76 Oct 03 '20 at 09:22