How do I properly handle Python 3's Popen.communicate() timeout?

Question

I find the documentation on Popen.communicate() somewhat incomplete. Take the example from the documentation, with a slight modification:

import subprocess
from subprocess import PIPE, TimeoutExpired

p = subprocess.Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)
try:
    outs, errs = p.communicate(ins, timeout=5)
except TimeoutExpired:
    p.kill()
    outs, errs = p.communicate(ins)  # What's with the input data?
# What value does p.returncode have at this point?

Then I have two questions:

1. If I send input ins to the child process, do I resend the input after catching the timeout exception? Does that correctly handle input the child has already read?
2. What will be the value (if any) of p.returncode after calling p.kill(), i.e. after sending a SIGKILL to the process?

score 0 · Answer 1 · edited Dec 28 '15 at 23:28


1. After killing the process, it makes no sense to send it input. The final p.communicate() only reads whatever data is left in the pipes.
2. Indeterminate. The process might exit on its own between the timeout and p.kill() in the exception handler, with whatever exit code it chooses. Otherwise, the p.returncode doc says

"A negative value -N indicates that the child was terminated by signal N (POSIX only)."

So -SIGKILL on POSIX. On Windows? If that matters to you, try it and see.
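A minimal sketch of the pattern described above, assuming POSIX. The helper names `run_with_timeout` and `describe` are hypothetical and not part of `subprocess`; the second `communicate()` call deliberately passes no input.

```python
import signal
import subprocess
import sys


def run_with_timeout(cmd, data, timeout):
    """Hypothetical helper: returns (outs, errs, returncode, timed_out)."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    timed_out = False
    try:
        outs, errs = p.communicate(data, timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True
        p.kill()
        # No input here: this call only drains the pipes and reaps the child.
        outs, errs = p.communicate()
    return outs, errs, p.returncode, timed_out


def describe(returncode):
    """Turn a POSIX return code into a short description."""
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

On POSIX, a child that was still running when the timeout fired should come back with a return code of `-signal.SIGKILL`, while a child that finished in time keeps its own exit status.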

edited Dec 28 '15 at 23:28

Jens


answered Dec 28 '15 at 21:29

Terry Jan Reedy


Thanks, that makes sense. I'm actually tempted to skip `communicate` after the `kill` altogether because the data is quite likely incomplete anyway—at least in my case. (Although on POSIX it seems I ought to be able to check the `returncode` and see if maybe the process finished before the `kill`.) I just added a link to the *other* `Popen.returncode` doc with your citation; I didn't see that first time 'round. – Jens Dec 28 '15 at 23:24
@Jens: `.returncode` is None until you've called `.wait()` or `.communicate()` and they returned successfully (or `.poll()` returned a non-None value). The second `.communicate()` call should drop the `ins` parameter, i.e., use `p.communicate()`, not `p.communicate(ins)`. Use `p.communicate()` even if you don't need the buffered output, to close the pipes. – jfs Dec 29 '15 at 00:58
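To illustrate the point about `.returncode` in the comment above, here is a small sketch. The function name `returncode_snapshots` is made up for this example; it reads the attribute once before and once after `.wait()`, for a child that exits almost immediately.

```python
import subprocess
import sys
import time


def returncode_snapshots():
    """Hypothetical demo: read .returncode before and after reaping the child."""
    p = subprocess.Popen([sys.executable, "-c", "pass"])
    time.sleep(1.0)        # the child has very likely exited by now
    before = p.returncode  # not updated yet: nothing has polled or waited
    p.wait()               # reaps the child and stores its exit status
    after = p.returncode
    return before, after
```

Even though the child is long gone, the first snapshot is still `None`; only `.wait()`, `.poll()` or `.communicate()` fill the attribute in.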

score 0 · Answer 2 · edited May 23 '17 at 12:15


I guess I could have saved myself from posting this question by reading the subprocess code :) Note that the module implements some internals differently for POSIX and Windows, and at this point I care only about the POSIX implementation.

So here is my take on all this, which largely confirms what has been answered already.

1. Passing ins a second time to communicate() after killing the child process upon timeout raises a ValueError with the message "Cannot send input after starting communication" (see here).
2. Note that Terry's answer covers this. The returncode is initially set to None.
   - If the child process terminated between raising the timeout exception and sending the kill signal, then kill() won't send a signal (see here), the second call to communicate() will simply gather the output data, and returncode will contain a proper value.
   - Otherwise the second call to communicate() will wait() until the child process is dead, set the return code value to -SIGKILL (see here) as part of the wait, and then gather the output data.

   This means that I can use p.returncode after the try-except block in the question, and it will tell me whether the child process terminated correctly or not.
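The behaviour described in this answer can be checked with a short sketch, again assuming POSIX. The function name `resend_after_timeout` is hypothetical, and the child is just a Python process that sleeps.

```python
import signal
import subprocess
import sys
from subprocess import PIPE, Popen, TimeoutExpired


def resend_after_timeout():
    """Hypothetical demo: try to resend input after a timed-out communicate()."""
    child = [sys.executable, "-c", "import time; time.sleep(30)"]
    p = Popen(child, stdin=PIPE, stdout=PIPE)
    try:
        p.communicate(b"data", timeout=0.5)
    except TimeoutExpired:
        p.kill()
    try:
        p.communicate(b"data")  # resending input is rejected
        error = None
    except ValueError as exc:
        error = str(exc)
    p.communicate()             # without input: drain the pipe and reap
    return error, p.returncode
```

The first element of the result carries the ValueError message, and the second is the return code that was set by the final `communicate()` call.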

edited May 23 '17 at 12:15

Community


answered Dec 29 '15 at 06:37

Jens


[I have to repeat](http://stackoverflow.com/questions/34489341/how-do-i-properly-handle-python-3s-popen-communicate-timeout#comment56747915_34501037): `.returncode` does NOT update itself magically. You have to call `.wait()`, `.communicate()` or `.poll()` to update it. Therefore, if `.communicate()` raises `TimeoutExpired`, then `.returncode` is **always** `None` -- no race condition. After the `.kill()` call, you should call `.wait()` to set `.returncode` if you haven't redirected stdin/stdout/stderr; otherwise call `.communicate()` to close the pipes and set `.returncode`. – jfs Dec 31 '15 at 00:27
Actually, `.communicate()` locks up reproducibly after a `kill()` in my scenario, and I could trace it down to the `select(timeout=None)` call [here](https://hg.python.org/releasing/3.4.4/file/tip/Lib/selectors.py#l314). Only the output was piped and registered for that SelectSelector. Not quite sure what's going on exactly here... yet. – Jens Jan 14 '16 at 05:16
it means that the subprocess spawned its own child processes, which inherit the pipes and survive `.kill()` (their parent is killed but they go on). How to kill an arbitrary family of processes is application dependent. If the children do not change their process group then use: `preexec_fn=os.setpgrp` and `os.killpg(p.pid, signal.SIGINT)` to kill the whole process group. If you don't care about the surviving descendants then close the pipes manually (`pipe.close()`) and call `p.wait()` after `p.kill()`. – jfs Jan 14 '16 at 06:47
You refer to [this solution](http://stackoverflow.com/questions/4789837/how-to-terminate-a-python-subprocess-launched-with-shell-true/4791612#4791612), correct? – Jens Jan 14 '16 at 07:52
yes, it is similar. Though notice that the details differ. – jfs Jan 14 '16 at 07:58
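A rough sketch of the process-group approach from the comments, POSIX only. Two things differ from jfs's wording and are assumptions of this example: it uses `start_new_session=True` instead of `preexec_fn=os.setpgrp`, and it sends SIGKILL rather than SIGINT. The helper name `run_group_with_timeout` is hypothetical.

```python
import os
import signal
import subprocess
import sys


def run_group_with_timeout(cmd, timeout):
    """Hypothetical helper: kill the child's whole process group on timeout."""
    # start_new_session=True makes the child a session and group leader,
    # so its process group id equals p.pid.
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, start_new_session=True)
    try:
        outs, _ = p.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(p.pid, signal.SIGKILL)  # child plus descendants in its group
        except ProcessLookupError:
            pass                              # the group is already gone
        # With every holder of the pipe killed, this call can reach EOF.
        outs, _ = p.communicate()
    return outs, p.returncode
```

This only reaches descendants that stay in the child's process group; anything that moves itself to another group or session would still survive and keep the pipe open.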
