Okay, so to answer my own question.
So, as mentioned in the comments, I earlier posted an answer, then deleted it. My deleted answer was:
I figured it out: apparently the proc.stdout stream object is doing its own internal buffering, despite the bufsize=0 argument passed to subprocess.Popen. The stream object seems to automatically buffer data available for reading on the pipe's stdout file descriptor behind the scenes.
So basically, I can't use os.read to read directly from the underlying descriptor, because the proc.stdout BufferedReader is doing its own buffering by reading from that same descriptor. To get this working as I want, I can simply call proc.stdout.read(READ_SIZE) instead of os.read(fd, READ_SIZE) after poll() indicates there is data to be read. That works as expected.
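For reference, here is roughly what that deleted answer amounted to - just a sketch, assuming proc is created with stdout=subprocess.PIPE and bufsize=0, and READ_SIZE is an arbitrary chunk size:

import select
import subprocess
import sys

READ_SIZE = 4096
proc = subprocess.Popen(["ls", "-lh"], stdout=subprocess.PIPE, bufsize=0)

poller = select.poll()
poller.register(proc.stdout.fileno(), select.POLLIN)

while proc.poll() is None:                       # loop until the child exits
    for fd, event in poller.poll(100):           # wait up to 100 ms for data
        if event & select.POLLIN:
            chunk = proc.stdout.read(READ_SIZE)  # read via the Python stream, not os.read()
            if chunk:
                sys.stdout.buffer.write(chunk)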
I deleted it because ultimately I realized this solution is not quite correct either. The problem is that even though it may work most of the time, there is no real guarantee it will work, because poll() will only return POLLIN events when the operating system reports that data is available to read in the kernel buffer. But calling proc.stdout.read() is not reading directly from the kernel buffer... it's reading from some internal Python buffer. So there is a mismatch between the POLLIN event and our decision to actually read. They are, in fact, completely unrelated - so there is no guarantee our polling is working correctly, and thus no guarantee a call to proc.stdout.read() would not block, or would not lose bytes.
But if we use os.read(), there is no guarantee our call to os.read() will always be able to read all bytes directly from the kernel buffer, because the Python BufferedReader object is basically "fighting against us" with its own buffering. We are both competing for the same underlying kernel buffer, and the Python BufferedReader may sometimes extract bytes for its own buffering before we are able to extract them via a call to os.read(). In particular, I observed that if the child process exits or aborts unexpectedly, the Python BufferedReader will immediately consume all remaining bytes from the kernel read buffer (even if you set bufsize to 0), which is why I was losing part of the output of ls -lh.
For anyone having trouble reproducing this problem, make sure that the child process you use outputs a significant amount of data, like at least around 15K.
So, what is the solution?
Solution 1:
I realized it is simply a non-starter to attempt to fight Python's own buffering facilities by trying to go around them with my own low-level system calls. So using the subprocess module is essentially out. I reimplemented this using low-level OS facilities directly, via the os module. Basically, I did what is often done in C: create a pipe with a call to os.pipe(), then os.fork(), then use os.dup2() in the child to redirect the write end of the pipe onto the child's sys.stdout.fileno() descriptor. Finally, call one of the os.exec* functions in the child process to begin executing the actual subprocess.
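Here is a rough sketch of that approach (ls -lh is just the example from above; the rest is illustrative, not my exact code):

import os
import sys

read_fd, write_fd = os.pipe()

pid = os.fork()
if pid == 0:  # child
    os.close(read_fd)                       # the child only writes
    os.dup2(write_fd, sys.stdout.fileno())  # point stdout at the write end of the pipe
    os.close(write_fd)
    os.execvp("ls", ["ls", "-lh"])          # replace the child image with the real program
else:  # parent
    os.close(write_fd)                      # the parent only reads
    while True:
        chunk = os.read(read_fd, 4096)      # read straight from the pipe's kernel buffer
        if not chunk:                       # EOF: the child closed its end
            break
        sys.stdout.buffer.write(chunk)
    os.waitpid(pid, 0)                      # reap the child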
Except even this isn't 100% correct. It works almost all of the time, unless you happen to create a child process that outputs a huge number of bytes to sys.stdout.fileno(). In that case you run into the problem of the OS pipe buffer, which has a limited capacity (I believe it's 64 KiB by default on Linux). Once the OS pipe buffer fills up, it's possible that the process will hang, because whatever library the child process is using to do I/O may also be doing its own buffering.
In my case, the child process was using the C++ <ostream> facilities to do I/O. These also do their own buffering, and so at some point when the pipe buffer filled up, the child process would simply hang. I never quite figured out exactly why. Presumably it should hang if the pipe buffer is full - but I would have thought that once the parent process (which I control) calls os.read() on the read end of the pipe, the child process could resume outputting. I suspect it's another issue with the child process doing its own buffering. The C/C++ standard library output functions (like printf in C, or std::cout in C++) don't write directly to stdout, but rather perform their own internal buffering. I suspect what happened is that the pipe buffer filled up, and some call to printf or std::cout simply hung after being unable to flush its buffer completely.
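To see just the kernel-side part of this (ignoring the stdio/ostream layer), here is a small sketch, assuming the usual default pipe capacity of roughly 64 KiB on Linux: if the parent never drains the pipe, the child's write blocks as soon as the buffer is full.

import os

read_fd, write_fd = os.pipe()
pid = os.fork()
if pid == 0:  # child: tries to write more than the pipe can hold
    os.close(read_fd)
    os.write(write_fd, b"x" * 200_000)  # blocks once the pipe buffer fills up
    os._exit(0)

# parent
os.close(write_fd)
# Remove this read loop and the child never finishes its write();
# draining the pipe is what lets it make progress again.
while os.read(read_fd, 4096):
    pass
os.waitpid(pid, 0)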
So this brings me to...
Solution 2:
So it turns out that using pipes to do this is really fundamentally broken. Nobody seems to say this in any of the thousands of tutorials out there, so perhaps I'm wrong, but I am claiming that using pipes to communicate with a child process is a fundamentally broken approach. There are simply too many things that can go wrong with all the various buffering going on at different levels. If you have complete control of the child process, you can always write directly to stdout, using (in Python) something like os.write(1, mybuffer) - but mostly you don't have control of the child process, and most programs will not write directly to stdout, but rather will use some standard I/O facility that has its own ways of buffering.
So, forget pipes. The real way to do this is to use pseudo-terminals. This may not be as portable, but it should work on most POSIX-compliant platforms. A pseudo-terminal is basically a pipe-like I/O object that behaves like the standard console output descriptors, stdout and stderr. The important thing is that with a pseudo-terminal, the low-level isatty() check returns true, and so standard I/O facilities like stdio.h in C will treat the descriptor like a line-buffered console.
In Python, you can create a pseudo-terminal using the pty module. To create a subprocess and then hook up its stdout to a pseudo-terminal in the parent process, you would do something like:
import os
import pty
import sys

out_master, out_slave = pty.openpty()
os.set_inheritable(out_master, True)
os.set_inheritable(out_slave, True)

pid = os.fork()
if pid == 0:  # child process
    try:
        assert os.isatty(out_slave)
        os.dup2(out_slave, sys.stdout.fileno())  # send the child's stdout to the pty slave
        os.close(out_master)                     # the child doesn't need the master end
        # exec* takes the program to run, then its argv (argv[0] is conventionally the program name)
        os.execlp(name_of_child_process, name_of_child_process, *child_process_args)
    except Exception:
        os._exit(os.EX_OSERR)
else:  # parent process
    os.close(out_slave)  # the parent only reads from the master end
And now you can read from out_master to get the output from whatever the child process writes to stdout, and since you're using a pseudo-terminal, the child process will behave exactly as if it were outputting to a console - so it works perfectly, with no buffering problems. You can of course do the exact same thing as above with stderr as well.
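For completeness, here is a rough sketch of the parent-side read loop, continuing from the snippet above (the chunk size, timeout, and how the end of output is detected are illustrative details):

import select

poller = select.poll()
poller.register(out_master, select.POLLIN)

done = False
while not done:
    for fd, event in poller.poll(100):         # wait up to 100 ms for output
        try:
            data = os.read(out_master, 4096)   # read straight from the pty master
        except OSError:                        # e.g. EIO on Linux once the child exits
            data = b""
        if data:
            sys.stdout.buffer.write(data)
        else:
            done = True
os.waitpid(pid, 0)  # reap the child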
Surprisingly, this solution is straightforward, yet I had to discover it myself because almost every tutorial or guide on the Internet that talks about communicating with a child process will insist you use pipes, which seems to be a fundamentally broken approach.