5

In Python, how do I check whether the stdout of a subprocess.Popen object has anything to read? I'm writing a wrapper around a tool that sometimes runs for hours on end. Calling .readline() on the child process's stdout severely cuts into the speed of the script when it runs for longer than a few minutes. I need a more efficient way to check whether there is anything to read on stdout. By the way, this particular tool only writes complete lines at a time. The script goes like this:

    #!/usr/bin/python -u
    #thiswrap.py

    import sys, time
    from subprocess import *

    chldp = Popen(sys.argv[1], bufsize=0, stdout=PIPE, close_fds=True)
    chstdin,chstdout=chldp.stdin,chldp.stdout
    startnoti=False

    while not chldp.poll():
        rrl=chstdout.readline() # <--- this is where the problem is
        if rrl[-8:]=='REDACTED TEXT':
            sys.stdout.write(rrl[:-1]+'   \r')
            if not startnoti: startnoti=True
        else:
            if startnoti: sys.stdout.write('\n')
            sys.stdout.write(rrl)
            if startnoti: # REDACTED
            time.sleep(0.1)
        time.sleep(0.1)

Any ideas?

  • Why is it a problem to let `readline` block? And why do you call `sleep`? – Vebjorn Ljosa Aug 08 '11 at 16:57
  • I'm going to ignore the troll part about readline blocking, and sleep is really just a stop-gap measure until I can get the readline stuff resolved. I know it's a bit lazy and clunky but I'm not going to need anything else in that part of the code unless it's something that might come from a better solution to know when to use readline() so it stays there until this problem goes away. – Matthew Jensen Aug 10 '11 at 18:46

3 Answers

4

You need to set the file descriptor to non-blocking mode. You can do this using fcntl:

import sys, time, fcntl, os
from subprocess import Popen, PIPE

chldp = Popen(sys.argv[1], bufsize=0, stdout=PIPE, close_fds=True)
chstdin, chstdout = chldp.stdin, chldp.stdout

# Switch the child's stdout pipe to non-blocking mode
fl = fcntl.fcntl(chstdout, fcntl.F_GETFL)
fcntl.fcntl(chstdout, fcntl.F_SETFL, fl | os.O_NONBLOCK)

# poll() returns None for as long as the child is still running
while chldp.poll() is None:
    try:
        rrl = chstdout.readline()
    except IOError:
        # no data available yet; wait a little and try again
        time.sleep(0.1)
        continue
    # use rrl
When there is no data available an IOError will be raised by readline().

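Note that chldp.poll() returns None while the subprocess is running and returns 0 when it exits successfully, so not chldp.poll() stays true even after a clean exit. You should probably use chldp.poll() is None as the condition of your while loop rather than not chldp.poll().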
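To illustrate the difference between the two loop conditions, here is a minimal sketch. The child commands are hypothetical stand-ins (Python one-liners) rather than the asker's tool, and the variable names are made up for the example.

```python
import subprocess
import sys

# Hypothetical child that keeps running for a few seconds.
sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(5)"])
status_while_running = sleeper.poll()   # None: the child has not exited yet
sleeper.kill()
sleeper.wait()

# Hypothetical child that exits immediately with status 0.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()                            # make sure the child has finished

status = child.poll()                   # 0 after a clean exit
still_running_buggy = not status        # True, although the child is gone
still_running_fixed = status is None    # False, as expected

print(status_while_running, status, still_running_buggy, still_running_fixed)
```

With `not chldp.poll()` the loop would keep spinning after a clean exit, whereas `chldp.poll() is None` ends it as soon as the child terminates with any status.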

Andrew Clark
  • Just a note: this doesn't work on Windows (I know the poster appears to be using Linux), see http://stackoverflow.com/questions/375427/non-blocking-read-on-a-subprocess-pipe-in-python – agf Aug 08 '11 at 21:57
  • This solution isn't working. It's causing the wrapper to not only miss lines but also throw up IOErrors anyway. Also the thing inevitably crashes within minutes of starting. (Btw, not linux, mac os 10.6 lol). Additionally, the child process won't ever quit of its own accord, so I'm keeping the while loop as is. – Matthew Jensen Aug 10 '11 at 18:40
1

Sadly, there is no ready-made way to poll for the condition "there is enough data in the pipe, including a line break, that readline() will return immediately".

If you want a line at a time and don't want to block, you can either:

Implement your own buffering through a class or generator and poll through that, e.g.:

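import os, select, time

def linereader(fd, timeout=0):
    """Yield complete lines read from fd; yield None when nothing is ready."""
    data = b""
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            yield None              # nothing to read right now
            continue
        chunk = os.read(fd, 4096)
        if not chunk:               # EOF: the child closed its end of the pipe
            if data:
                yield data.decode()
            return
        data += chunk
        lines = data.split(b"\n")
        data = lines[-1]            # keep the trailing partial line buffered
        for line in lines[:-1]:
            yield line.decode()

# use (chldp is the Popen object from the question)
for line in linereader(chldp.stdout.fileno()):
    if line is not None:
        print(line)
    else:
        time.sleep(...)  # polling interval of your choice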

Or use threads (left as an exercise for the reader; note that older versions of Python misbehave if you start a subprocess from a thread other than the main one).
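As one possible take on the thread-based variant, here is a minimal sketch in which a background thread does the blocking readline() calls and hands lines to the main loop through a queue. The helper name `enqueue_lines`, the `None` sentinel, the 0.1 s timeout, and the child command are all assumptions made for the illustration, not part of the original answer.

```python
import queue
import subprocess
import sys
import threading

def enqueue_lines(stream, q):
    """Run in a background thread: block on readline() and queue each line."""
    for line in iter(stream.readline, b""):
        q.put(line)
    stream.close()
    q.put(None)  # sentinel: the child closed its stdout

# Hypothetical child that prints three complete lines.
child = subprocess.Popen(
    [sys.executable, "-c", "for i in range(3): print('line', i)"],
    stdout=subprocess.PIPE,
)
lines_q = queue.Queue()
reader = threading.Thread(target=enqueue_lines, args=(child.stdout, lines_q))
reader.daemon = True   # do not keep the interpreter alive for this thread
reader.start()

received = []
while True:
    try:
        item = lines_q.get(timeout=0.1)  # wait at most 0.1 s for a line
    except queue.Empty:
        continue                         # nothing yet; free to do other work
    if item is None:
        break
    received.append(item.decode().rstrip())

child.wait()
print(received)
```

The main loop never blocks for longer than the queue timeout, so it can do other work between lines, and this approach does not depend on fcntl.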

Dima Tisnek
0

The solution proposed in the first answer is almost correct. You just need to pass an integer file descriptor as the first argument to fcntl.fcntl, not the Python file object. (Taken from another answer.)

Here's the code to be changed:

chstdout = chldp.stdout
fd = chstdout.fileno()
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
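For a quick check of what the flag change does, here is a small sketch that applies the same two fcntl calls to an ordinary pipe created with os.pipe(), used here as a stand-in for the child's stdout. The helper name `set_nonblocking` is made up for the example, and the code assumes a Unix-like system, since the fcntl module is not available on Windows.

```python
import fcntl
import os

def set_nonblocking(fd):
    """Add O_NONBLOCK to the flags of an integer file descriptor."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

read_fd, write_fd = os.pipe()   # stand-in for the child's stdout pipe
set_nonblocking(read_fd)

try:
    os.read(read_fd, 1024)      # the pipe is empty at this point
    empty_read_raised = False
except BlockingIOError:         # subclass of OSError (IOError in Python 2)
    empty_read_raised = True

os.write(write_fd, b"hello\n")
data = os.read(read_fd, 1024)   # data is available, so this returns at once

os.close(read_fd)
os.close(write_fd)
print(empty_read_raised, data)
```

On Python 3.5 and later, os.set_blocking(fd, False) has the same effect as the helper on Unix.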
krydev