Low level select.poll() to read from subprocess

Question

I'm using the low-level POSIX facilities in the select and os modules to read data from a pipe connected to a running shell process. To avoid blocking indefinitely, I set the stdout file descriptor of the piped process to non-blocking mode using the fcntl module, and then use select.poll to poll the file descriptor until data is available for reading. Once data is available, I use os.read() to read some data from the pipe, and then keep looping until os.read() returns an empty bytes object or some error occurs.

I have it working, except for some reason the data I end up reading from the pipe gets truncated. I read about half of the expected output of the piped process, and then os.read() returns an empty bytes object. I can't figure out why I lose the remainder of the data.

Basically, I have a run_poll_once() function that runs a single call to the poll object's poll() method. The function returns True if we should keep polling for more data, and False if we should stop. The function is as follows (with error checking removed and edited for clarity and relevance):

def run_poll_once(poll):
    events = poll.poll(0.10)
    for fd, event in events:
        if event & select.POLLERR:
            return False

        if (event & select.POLLIN) or (event & select.POLLHUP):
            data = os.read(fd, READ_SIZE)
            print("Read:", data)
            if len(data) == 0: return False
            # ... do stuff with data

    return True

I then call this function like:

with subprocess.Popen(
        ["ls", "-lh"], 
        stdin = None, 
        stdout = subprocess.PIPE, 
        bufsize = 0
    ) as proc:

    # --- snip setting proc.stdout.fileno() to non-blocking mode
    poll = select.poll()
    event_mask = select.POLLIN | select.POLLERR | select.POLLHUP
    poll.register(proc.stdout.fileno(), event_mask)

    while run_poll_once(poll):
        pass

So this gets me about half the expected output from the piped process (ls -lh), and then os.read() prematurely returns an empty bytes object. So what am I doing wrong here?

I'm unable to reproduce the described issue, everything looks good and works. But still there is [another solution described here](https://stackoverflow.com/questions/375427/non-blocking-read-on-a-subprocess-pipe-in-python) which may be helpful — sqr163, Jun 23 '17 at 07:37
@Siler I see you answered and then deleted your answer. Your answer seems good to me; why delete it? May I recommend that you undelete it, so that others who are interested in your question (and don't have a rep of 10k+) can see your solution? You can even accept your own answer; that's ok to do as long as the answer solves your problem. — Scott Mermelstein, Jun 23 '17 at 15:43
@Siler pls, share your research. I'm very interested to see why this code may not work. — sqr163, Jun 24 '17 at 08:27
Fyi, there's a discussion about this on [meta](https://meta.stackoverflow.com/questions/351277/user-answered-their-own-question-then-deleted-their-seemingly-valid-answer) — Scott Mermelstein, Jun 26 '17 at 16:12
@ScottMermelstein Thank You very much! Before posting my first comment here I spent half day browsing kernel source code (fs/poll.c) and c-python (Modules/posixmodule.c, _io, etc.) in order to find answer why this piece of code may not work - and didn't find anything. It must work. — sqr163, Jun 26 '17 at 19:11
@Siler I've just read the answer you deleted (thank you Scott) - and it is perfectly matching with the thoughts after my research: the code must work, and of course when `bufsize = 0` is regarded. — sqr163, Jun 26 '17 at 19:20
@sqrt163 As you probably saw from meta, if _you're_ confident in their answer, feel free to resurrect it as a CW - especially if you can write it from your own perspective. I was only watching this out of related interests, and don't have enough confidence to post it myself. — Scott Mermelstein, Jun 26 '17 at 19:45
I'd love to post an answer, but still not able neither to reproduce nor find possible reason for the described behaviour in source codes (kernel 3.8 fs/pipe.c and python 3.6 Modules/_io/fileio.c::_Py_read, etc.).
When `bufsize=0` python3.6 returns "raw" instance of PyFileIO_Type which seem to be unbuffered (but I may be wrong!) - see Modules/_io/_iomodule.c::_io_open_impl, iobase.c::_io__RawIOBase_read_impl, etc.
Both `os.read(fd, READ_SIZE)` and `proc.stdout.read(READ_SIZE)` works properly in my setup. — sqr163, Jun 27 '17 at 21:27

Siler · Answer 1 · 2017-06-30T05:29:35.310

Okay, so to answer my own question.

So, as mentioned in the comments, I earlier posted an answer, then deleted it. My deleted answer was:

I figured it out: apparently the proc.stdout stream object is automatically doing its own internal buffering, despite the bufsize = 0 argument passed to subprocess.Popen. The stream object seems to automatically buffer data available for reading on the pipe's stdout file descriptor behind the scenes.

So basically, I can't use os.read to read directly from the underlying descriptor, because the proc.stdout BufferedReader is automatically doing it's own buffering by reading from the underlying descriptor. To get this working as I want, I can simply call proc.stdout.read(READ_SIZE) instead of os.read(fd, READ_SIZE) directly, after poll() indicates there is data to be read. That works as expected.

I deleted it because ultimately I realized this solution is not quite correct either. The problem is that even though it may work most of the time, there is no real guarantee this will work, because the call to poll() will only return POLLIN events when an actual low-level operating-system interrupt occurs indicating that data is available to read in the kernel buffer. But calling proc.stdout.read() is not reading directly from the kernel buffer... it's reading from some internal Python buffer. So there is a mismatch between the POLLIN event, and our decision to actually read. They are in fact, completely unrelated - so there is no guarantee our polling is working correctly, and thus no guarantee a call to proc.stdout.read() would not block, or lose bytes.

But if we use os.read(), there is no guarantee our call to os.read() will always be able to read all bytes directly from the kernel buffer, because the Python BufferedReader object is basically "fighting against us" to do it's own buffering. We are both fighting over the same underlying kernel buffer, and the Python BufferedReader may sometimes extract bytes for its own buffering before we are able to extract those bytes via a call to os.read(). In particular, I observe that if the child process exits or aborts unexpected, the Python BufferedReader will immediately consume all remaining bytes from the kernel read buffer, (even if you set bufsize to 0) which is why I was losing part of the output of ls -lh.

For anyone having trouble reproducing this problem, make sure that the child process you use outputs a significant amount of data, like at least around 15K.

So, what is the solution?

Solution 1:

I realized it is simply a non-starter to attempt to fight Python's own buffering facilities by trying to go around the Python buffering using my own low-level system calls. So, using the subprocess module is essentially out. I reimplemented this using low level OS facilities directly, via the os module. Basically, I did what is often done in C: create a pipe file descriptor using a call to os.pipe(), then os.fork(), then use os.dup() to direct the read end of the pipe to the child process's sys.stdout.fileno() descriptor. Finally, call one of the os.exec functions in the child process to begin executing the actual subprocess.

Except even this isn't 100% correct. This works almost all of the time, unless you happen to create a child process that starts outputting huge amounts of bytes to sys.stdout.fileno(). In this case you run into the problem of the OS pipe buffer, which has some limit (I think it's 65K on Linux). Once the OS pipe buffer fills up, it's possible that the process will hang because whatever library the child process is using to do I/O may also be doing it's own buffering.

In my case, the child process was using the C++ <ostream> facilities to do I/O. This also does its own buffering, and so at some point when the pipe buffer filled up, the child process would simply hang. I never quite figured out exactly the reason. Presumably, it should hang if the pipe buffer is full - but I would have thought that if the parent process (which I control) calls os.read() on the read end of the pipe, the child process could resume outputting. I suspect it's another issue with the child process doing it's own buffering. The C/C++ standard library output functions (like printf in C, or std::cout in C++) don't directly write to stdout, but rather perform their own internal buffering. I suspect what happened is that the pipe buffer filled up, and so some call to printf or std::cout simply hung after being unable to flush the buffer completely.

So this brings me to...

Solution 2:

So it turns out that using pipes to do this is really fundamentally broken. Nobody seems to ever say this on thousands of tutorials, so perhaps I'm wrong, but I am claiming that using pipes to communicate with a child process is a fundamentally broken approach. There are simply too many things that can go wrong with all the various buffering going on at different levels. If you have complete control of the child process, you can always write directly to stdout, using (in Python) something like os.write(1, mybuffer) - but mostly you don't have control of the child process, and most programs will not write directly to stdout, but rather will use some standard I/O facilities which have their own ways of buffering.

So, forget pipes. The real way to do this is to use pseudo-terminals. This may not be as portable, but it should work on most POSIX compliant platforms. A pseudo-terminal is basically a pipe-like I/O object that behaves like the standard console output descriptors, stdout and stderr. The important thing is that with a pseudo-terminal, the low level iocontrol system call isatty returns true, and so standard I/O facilities like stdio.h in C will treat the pipe like a line-buffered console.

In Python, you can create a pseudo-terminal using the pty module. To create a subprocess, and then hook up it's stdout to a pseudo-terminal in the parent process, you would do:

out_master, out_slave = pty.openpty()
os.set_inheritable(out_master, True)
os.set_inheritable(out_slave, True)

pid = os.fork()

if pid == 0: # child process
  try:
    assert(os.isatty(out_slave))
    os.dup2(out_slave, sys.stdout.fileno())
    os.close(out_master)
    os.execlp(name_of_child_process, shell_command_to_execute_child_process)
  except Exception as e:
    os._exit(os.EX_OSERR)
else: # parent process
  os.close(out_slave)

And now you can read from out_master to get the output from whatever the child process writes to stdout, and since you're using a pseudo-terminal, the child process will behave exactly as if it was outputting to a console - so it works perfectly with no buffering problems. You can of course do the exact same thing as above with stderr as well.

Surprisingly, this solution is straightforward, yet I had to discover it myself because almost every tutorial or guide on the Internet that talks about communicating with a child process will insist you use pipes, which seems to be a fundamentally broken approach.

Thanks for sharing the details. It sounds like a stretch to say that pipes don't work due to buffering; there should have been some way to do this without buffering in the way. I wonder if you could have just used `write` on the other side, instead of `printf` or `cout`? In any case, you've now shared a solution that is apparently unique on the internet. I'd never played with pseudo-terminals before, so it's good to know it's available. — Scott Mermelstein, Jun 29 '17 at 19:14
You can definitely use `write` on the other side (the child process side). That would work fine as well. Except that assumes you have full control of the child process. In the event the child process is some arbitrary program you didn't write, you have no control over how the child process outputs to `stdout`. And of course, most programs will not use `write` directly, but will use standard buffered I/O facilities. — Siler, Jun 29 '17 at 19:46

Low level select.poll() to read from subprocess

1 Answers1