You could redirect stderr to stdout:
from subprocess import Popen, PIPE, STDOUT
proc = Popen(['./mr-task.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1)
for line in iter(proc.stdout.readline, b''):
    print line,
proc.stdout.close()
proc.wait()
See Python: read streaming input from subprocess.communicate().
In my real program I redirect stderr to stdout and read from stdout, so bufsize is not needed, is it?
The redirection of stderr to stdout and bufsize are unrelated. Changing bufsize might affect the time performance (the default is bufsize=0, i.e., unbuffered, on Python 2). Unbuffered I/O might be 10..100 times slower. As usual, you should measure the time performance if it is important.
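One way to measure it is sketched below. It is written for Python 3 (where the default is bufsize=-1), and the child command is a hypothetical stand-in for ./mr-task.sh that just prints some lines. The numbers depend on the system, so treat it as a starting point for your own measurement rather than a benchmark:

```python
import subprocess
import sys
import time

# Hypothetical child: prints 5000 short lines (stands in for ./mr-task.sh).
CHILD = "for i in range(5000): print('line', i)"

def read_all_lines(bufsize):
    """Read the child's output line by line; return (line_count, seconds)."""
    start = time.perf_counter()
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            bufsize=bufsize)
    count = 0
    for line in iter(proc.stdout.readline, b""):
        count += 1
    proc.stdout.close()
    proc.wait()
    return count, time.perf_counter() - start

# bufsize=0: unbuffered pipe object; bufsize=-1: default-sized buffer.
unbuffered_count, unbuffered_time = read_all_lines(0)
buffered_count, buffered_time = read_all_lines(-1)
print("bufsize=0:  %d lines in %.3fs" % (unbuffered_count, unbuffered_time))
print("bufsize=-1: %d lines in %.3fs" % (buffered_count, buffered_time))
```

Both runs read the same lines; only the elapsed time should differ.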
Calling Popen.wait/communicate after the subprocess has terminated is just for clearing the zombie process, and these two methods do not differ in that case, correct?
The difference is that proc.communicate() closes the pipes before reaping the child process. It releases file descriptors (a finite resource) so that they can be used by other files in your program.
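The difference can be observed on the pipe object itself. This is a small Python 3 sketch with a hypothetical child that prints one short line; the variable names are illustrative only:

```python
import subprocess
import sys

# Hypothetical child that prints a little output and exits.
CMD = [sys.executable, "-c", "print('done')"]

# communicate(): reads the output, closes the pipe, then reaps the child.
p1 = subprocess.Popen(CMD, stdout=subprocess.PIPE)
out, _ = p1.communicate()
closed_after_communicate = p1.stdout.closed

# wait(): only reaps the child; the pipe stays open until closed explicitly.
p2 = subprocess.Popen(CMD, stdout=subprocess.PIPE)
p2.wait()  # safe here only because the output is tiny and fits in the pipe
closed_after_wait = p2.stdout.closed
p2.stdout.close()  # release the file descriptor ourselves

print(closed_after_communicate, closed_after_wait)
```

After wait() you still have to close proc.stdout yourself, as the code at the top of this answer does.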
About the buffer: if the output fills the buffer up to its maximum size, will the subprocess hang? Does that mean that with the default bufsize=0 setting I need to read from stdout as soon as possible so that the subprocess doesn't block?
No. It is a different buffer. bufsize controls the buffer inside the parent that is filled/drained when you call the .readline() method. There won't be a deadlock whatever bufsize is.
The code (as written above) won't deadlock no matter how much output the child might produce.
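To check this claim yourself, here is a Python 3 sketch under stated assumptions: the child is a hypothetical noisy program that writes about 200 KB to each of stdout and stderr (more than a typical OS pipe buffer holds), and the parent reads the merged stream with several bufsize values:

```python
import subprocess
import sys

# Hypothetical noisy child: writes ~200 KB to each of stdout and stderr.
CHILD = (
    "import sys\n"
    "for _ in range(200):\n"
    "    sys.stdout.write('o' * 1023 + '\\n')\n"
    "    sys.stderr.write('e' * 1023 + '\\n')\n"
)

def drain(bufsize):
    """Read the merged stdout/stderr stream to EOF; return total bytes read."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            bufsize=bufsize)
    total = 0
    for line in iter(proc.stdout.readline, b""):
        total += len(line)
    proc.stdout.close()
    proc.wait()
    return total

# Unbuffered, a small explicit buffer, and the default buffer size.
totals = {bufsize: drain(bufsize) for bufsize in (0, 4096, -1)}
print(totals)
```

Every run finishes and reads the same amount of data, because there is only one pipe and the parent keeps draining it until EOF.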
The code in @falsetru's answer can deadlock because it creates two pipes (stdout=PIPE, stderr=PIPE) but it reads only from one pipe (proc.stderr).
There are several buffers between the child and the parent, e.g., C stdio's stdout buffer (a libc buffer inside the child process, inaccessible from the parent) and the child's stdout OS pipe buffer (inside the kernel; the parent process may read the data from here). These buffers have a fixed size: they won't grow if you put more data into them. If stdio's buffer overflows (e.g., during a printf() call) then the data is pushed downstream into the child's stdout OS pipe buffer. If nobody reads from the pipe then this OS pipe buffer fills up and the child blocks (e.g., on a write() system call) trying to flush the data.
To be concrete, I've assumed a C stdio-based program and a POSIXy OS.
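The fixed size of the OS pipe buffer can be observed directly. The following is a minimal Python 3 sketch, assuming a POSIX system; pipe_capacity is a hypothetical helper name. It writes into a pipe that nobody reads from, with the write end set to non-blocking so that the write fails at the point where a blocking writer (such as the child) would hang:

```python
import os

def pipe_capacity():
    """Count how many bytes fit into an OS pipe that nobody reads from."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # fail instead of blocking when full
    written = 0
    try:
        while True:
            written += os.write(write_fd, b"x" * 1024)
    except BlockingIOError:
        pass  # the pipe is full; a blocking writer would hang here
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written

capacity = pipe_capacity()
print("pipe buffer holds", capacity, "bytes")
```

The exact number is system-dependent, so don't rely on a particular value.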
The deadlock happens because the parent tries to read from the stderr pipe, which is empty because the child is blocked trying to flush its stdout into a full pipe. Thus both processes hang.
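If you do need two separate pipes, one way to avoid this situation is to let proc.communicate() drain both of them. The Python 3 sketch below uses a hypothetical child that writes more to stdout than a typical pipe buffer holds before it writes a short message to stderr:

```python
import subprocess
import sys

# Hypothetical child: fills stdout well past a typical pipe buffer,
# then writes a short message to stderr.
CHILD = (
    "import sys\n"
    "sys.stdout.write('o' * 200000)\n"
    "sys.stderr.write('finished\\n')\n"
)

proc = subprocess.Popen([sys.executable, "-c", CHILD],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Reading only proc.stderr here could hang: the child would block on a full
# stdout pipe before it ever writes to (or closes) stderr.
# communicate() drains both pipes concurrently, so neither side gets stuck.
out, err = proc.communicate()
print(len(out), err)
```

The trade-off is that communicate() collects all the output in memory and returns it only after the child exits, so it does not give you line-by-line streaming.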