Continuously process large amounts of stdout and stderr from a child process

Question

There are a lot of good answers on Stack Overflow about how to handle output with subprocesses, async IO, and avoiding deadlock with PIPE. Something is just not sinking in for me though; I need some guidance on how to accomplish the following.

I want to run a subprocess from my Python program. The subprocess generates a ton of standard output, and a little bit of standard error if things go bad. The subprocess itself takes about 20 minutes to complete. For the output and error generated, I want to be able to both print it to the terminal and write it to a log file.

Doing the latter was easy. I just opened two files and set them as stdout and stderr on the Popen object. However, also capturing the output as lines so that I can print them continuously to the terminal has me vexed. I was thinking I could use the poll() method to poll continuously. With this, though, I'd still need to use PIPE for stdout and stderr and call read() on them, which would block until EOF.

I think what I'm trying to accomplish is this:

start the subprocess
    while process is still running
        if there are any lines from stdout
           print them and write them to the out log file
        if there are any lines from stderr
           print them and write them to the err log file
        sleep for a little bit

Does that seem reasonable? If so, can someone explain how one would implement the 'if' parts here without blocking?

Thanks

Other questions on the subject suggest there is no non-blocking read available? That means you would have to use async IO or select, otherwise read will block until EOF. Make sure you are flushing sys.stdout in the subprocess. print(flush=True) does not seem to work for me. — user3467349, Feb 12 '15 at 01:01
asyncio: [Subprocess.Popen: cloning stdout and stderr both to terminal and variables](http://stackoverflow.com/a/25960956/4279) — jfs, Feb 12 '15 at 02:02
multithreaded: [Python subprocess get children's output to file and terminal?](http://stackoverflow.com/a/4985080/4279) — jfs, Feb 12 '15 at 02:03
see also, [Displaying subprocess output to stdout and redirecting it](http://stackoverflow.com/q/25750468/4279). Is it enough? — jfs, Feb 12 '15 at 02:04
Apparently my searching skills need some work. The asyncio example is just what I needed. Thanks. — D.C., Feb 14 '15 at 02:03
possible duplicate of [Can you make a python subprocess output stdout and stderr as usual, but also capture the output as a string?](http://stackoverflow.com/questions/12270645/can-you-make-a-python-subprocess-output-stdout-and-stderr-as-usual-but-also-cap) — D.C., Feb 19 '15 at 00:47

score 2 · Accepted Answer · answered Feb 12 '15 at 01:06

Here is my select.select version:

Subprocess (foo.py):

import sys
import time

def foo():
    for i in range(5):
        print("foo %s" % i, file=sys.stdout)
        # Flush explicitly so the parent sees each line right away
        # instead of waiting for the pipe buffer to fill.
        sys.stdout.flush()
        time.sleep(7)

foo()

Main:

import subprocess as sp
import select

proc = sp.Popen(["python", "foo.py"], stderr=sp.PIPE, stdout=sp.PIPE)
last_line = b"content"
while last_line:
    # select() returns an empty list on timeout, so check before indexing.
    ready = select.select([proc.stdout], [], [], 60)[0]
    if not ready:
        print('timed out')
        break
    # readline() returns b"" at EOF, which ends the loop.
    last_line = ready[0].readline()
    print(last_line)
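
For the full task in the question (both streams, terminal plus log files), a rough sketch along the same select lines might look like the following. It is not the code above, just one possible arrangement: the function name `tee_process`, the log path parameters, and `chunk_size` are made up for illustration. It reads with `os.read()` rather than `readline()`, and it is POSIX only, since `select()` does not accept pipes on Windows.

```python
import os
import select
import subprocess
import sys


def tee_process(cmd, out_log_path, err_log_path, chunk_size=4096):
    """Run cmd, copying stdout/stderr to the terminal and to log files.

    Hypothetical helper for illustration. POSIX only. Returns the exit code.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    with open(out_log_path, "wb") as out_log, open(err_log_path, "wb") as err_log:
        # Map each pipe's file descriptor to (terminal stream, log file).
        targets = {
            proc.stdout.fileno(): (sys.stdout, out_log),
            proc.stderr.fileno(): (sys.stderr, err_log),
        }
        while targets:
            # Wait until at least one pipe has data or has reached EOF.
            ready, _, _ = select.select(list(targets), [], [])
            for fd in ready:
                # os.read() returns whatever is available (up to chunk_size)
                # and does not wait for a complete line.
                data = os.read(fd, chunk_size)
                if not data:
                    # An empty read means EOF: stop watching this pipe.
                    del targets[fd]
                    continue
                terminal, log = targets[fd]
                log.write(data)
                # A multi-byte character split across chunks may show up as a
                # replacement character on the terminal; the log stays exact.
                terminal.write(data.decode(errors="replace"))
                terminal.flush()
    proc.stdout.close()
    proc.stderr.close()
    return proc.wait()
```

Something like `tee_process(["python", "foo.py"], "out.log", "err.log")` would then run the subprocess from above. There is no timeout here, so the loop only ends once both pipes have been closed by the child.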

answered Feb 12 '15 at 01:06

user3467349

The code should read from `proc.stderr` too as OP asks. `readline()` may block after `select()`, use `os.read()` instead. "timed out" does not mean that all output is read; don't break the loop prematurely. Write output to a file too as OP asks. `select()` with pipes won't work on Windows. Follow the links in the comments above, to see portable solutions. – jfs Feb 12 '15 at 13:46
@J.F.Sebastian It's trivial to add proc.stderr to the select call or to remove the timeout if he doesn't need it. How exactly can readline block after select? – user3467349 Feb 12 '15 at 19:25
@user3467349: if it is trivial for you then make the appropriate changes. It is very easy to introduce a bug in such code. Why do you think `select()` returns only when a full line is ready? – jfs Feb 12 '15 at 19:27
Because `print` always writes a full line? I'll add `os.read` as an alternative. – user3467349 Feb 12 '15 at 19:35
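
Since the comments point out that `select()` with pipes won't work on Windows, here is a minimal sketch of the multithreaded route instead, with one reader thread per pipe. The names `pump` and `tee_process_threaded` and the log path parameters are hypothetical, and this is not the code from the linked answers. Because each pipe has its own thread, a blocking `readline()` is harmless here.

```python
import subprocess
import sys
import threading


def pump(pipe, terminal, log_path):
    """Copy lines from pipe to the terminal and a log file until EOF."""
    with pipe, open(log_path, "wb") as log:
        # iter(readline, b"") yields lines until readline() returns b"" at EOF.
        for line in iter(pipe.readline, b""):
            log.write(line)
            terminal.write(line.decode(errors="replace"))
            terminal.flush()


def tee_process_threaded(cmd, out_log_path, err_log_path):
    """Hypothetical helper: run cmd and tee both streams; returns exit code."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    threads = [
        threading.Thread(target=pump, args=(proc.stdout, sys.stdout, out_log_path)),
        threading.Thread(target=pump, args=(proc.stderr, sys.stderr, err_log_path)),
    ]
    for t in threads:
        t.start()
    # Both threads finish once the child closes its ends of the pipes.
    for t in threads:
        t.join()
    return proc.wait()
```

The child still needs to flush its output (as foo.py does) for lines to show up promptly; otherwise they arrive in buffer-sized bursts.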
