33

Possible Duplicate:
Wrap subprocess' stdout/stderr

In this question, hanan-n asked whether it was possible to have a Python subprocess that sends its output to stdout while also keeping that output in a string for later processing. The solution in that case was to loop over every output line and print it manually:

import subprocess

output = []
p = subprocess.Popen(["the", "command"], stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, ''):
    print(line)
    output.append(line)

However, this solution doesn't generalise to the case where you want to do this for both stdout and stderr, while satisfying the following:

  • the output from stdout/stderr should go to the parent process' stdout/stderr respectively
  • the output should be done in real time as much as possible (but I only need access to the strings at the end)
  • the order between stdout and stderr lines should not be changed (I'm not quite sure how that would even work if the subprocess flushes its stdout and stderr buffers at different intervals; let's assume for now that we get everything in nice chunks that contain full lines?)

I looked through the subprocess documentation, but couldn't find anything that can achieve this. The closest I could find is to add stderr=subprocess.STDOUT and use the same solution as above (sketched below), but then we lose the distinction between regular output and errors. Any ideas? I'm guessing the solution - if there is one - will involve asynchronous reads from p.stdout and p.stderr.
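
For reference, here is a rough sketch of that merged-stream workaround; it is the same loop as above, only the Popen call changes (["the", "command"] is still a placeholder):

import subprocess

output = []
# stderr=subprocess.STDOUT folds error output into the stdout pipe, so the
# ordering is preserved but regular output and errors can no longer be told apart
p = subprocess.Popen(["the", "command"],
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in iter(p.stdout.readline, ''):
    print(line)
    output.append(line)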

Here is an example of what I would like to do:

p = subprocess.Popen(["the", "command"])
p.wait()  # while p runs, the command's stdout and stderr should behave as usual
p_stdout = p.stdout.read()  # unfortunately, this will return '' unless you use subprocess.PIPE
p_stderr = p.stderr.read()  # ditto
[do something with p_stdout and p_stderr]
pflaquerre
  • I do not see how this technique (wrapping any IO functions and stream objects) does not generalize? – ninjagecko Sep 04 '12 at 20:19
  • How do you suggest I get the subprocess to print to stdout and stderr in real time, while still getting the output in a string at the end, then? – pflaquerre Sep 04 '12 at 20:21

3 Answers

38

This example seems to work for me:

# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4

import subprocess
import sys
import select

p = subprocess.Popen(["find", "/proc"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)

stdout = []
stderr = []

while True:
    reads = [p.stdout.fileno(), p.stderr.fileno()]
    # block until at least one of the two pipes has data ready to read
    ret = select.select(reads, [], [])

    for fd in ret[0]:
        if fd == p.stdout.fileno():
            read = p.stdout.readline()
            sys.stdout.write('stdout: ' + read)
            stdout.append(read)
        if fd == p.stderr.fileno():
            read = p.stderr.readline()
            sys.stderr.write('stderr: ' + read)
            stderr.append(read)

    if p.poll() != None:
        break

print 'program ended'

print 'stdout:', "".join(stdout)
print 'stderr:', "".join(stderr)

In general, any situation where you want to do stuff with multiple file descriptors at the same time and you don't know which one will have stuff for you to read, you should use select or something equivalent (like a Twisted reactor).

Thomas Vander Stichele
  • The order might not be *exactly* the same as the original one since `stdout` and `stderr` might both have data by the time we call `select`, but this is as close to a solution as I've seen so far. Thank you! – pflaquerre Sep 05 '12 at 13:09
  • What do you mean about the order? In the example I give, if there is data to read from both, it will handle stdout first because its fd normally will be lower than stderr's, and select should return it in order. If that's not the case or not guaranteed, you can simply change the code to do something like if p.stdout.fileno() in ret[0]: and then the same for stderr, that should guarantee the order. – Thomas Vander Stichele Sep 05 '12 at 13:44
  • I'm talking about the order in which the streams were written to. For example, the subprocess could write to stderr first, then write something else to stdout, but `select` would still read from stdout before stderr, which is not the same order as what actually happened. In the worst case, the subprocess might be writing so much to stdout that anything in stderr will get put off until the process is done running. (unless I misunderstood how select works?) – pflaquerre Sep 05 '12 at 14:14
  • Although it's unlikely, you're right that there is the potential for your scenario happening. I'm not convinced it's possible to guarantee that though. Note that typically stdout has line buffering enabled, while stderr does not. – Thomas Vander Stichele Sep 05 '12 at 17:55
  • it won't work on Windows. select() accepts sockets only on Windows – jfs Sep 05 '12 at 18:34
  • I am finding a cross platform approach for this kind of problem. Having read this, https://docs.python.org/2/library/select.html, it seems like as @J.F.Sebastian said, it won't work on Windows. Any proven alternative? – swdev Sep 16 '14 at 20:57
  • @swdev: [my answer that uses multiple threads works](http://stackoverflow.com/a/12287578/4279) Ignore @KomodoDave's comment: he was confused that I've used `call` as a name, I've renamed it to `teed_call` to make it crystal clear that it is not `subprocess.call` and therefore it accepts arbitrary file-like objects. You could also do it in a single thread (async. io) e.g., [Displaying subprocess output to stdout and redirecting it](http://stackoverflow.com/q/25750468/4279) – jfs Sep 17 '14 at 01:09
  • There is a problem with this answer. If more data becomes readable in the streams between the call to select() and the call to poll(), that data will never be printed. – orodbhen Nov 10 '15 at 14:33
  • It works most of the time, but I noticed that sometimes even when `p.poll() is None`, not all stdout/stderr content has been written to the screen. I added something like `for line in proc.stdout/stderr.readlines(): print(line)` just before the break, and it appears to work. – zyxue Mar 06 '16 at 04:44
  • This code of Thomas doesn't work fine if you replace "find /proc" with "ls -l" (Ubuntu 14.10). From time to time it gives empty output or just 1-3 lines of "ls -l" output. – kinORnirvana Apr 27 '16 at 16:42
  • WARNING! readline() is blocking. This example gives a good idea to start but it's incomplete to be usable in production. See a more complete Adam Rosenfeld version here: http://stackoverflow.com/questions/7729336/how-can-i-print-and-display-subprocess-stdout-and-stderr-output-without-distorti/7730201#7730201 – kinORnirvana Apr 27 '16 at 17:12
  • `p.poll()` does not ensure you have read all the data. If processing inside the `for` loop is slow, there is a race condition in this code. – Jérôme Pouiller Feb 21 '19 at 18:12
12

To print to the console and capture the stdout/stderr of a subprocess in strings, in a portable manner:

from StringIO import StringIO

fout, ferr = StringIO(), StringIO()
exitcode = teed_call(["the", "command"], stdout=fout, stderr=ferr)
stdout = fout.getvalue()
stderr = ferr.getvalue()

where teed_call() is defined in the answer to "Python subprocess get children's output to file and terminal?"

You could use any file-like objects (anything with a .write() method).
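
teed_call() itself lives in that linked answer; the gist is one reader thread per pipe, writing each line both to the real stdout/stderr and to whatever file-like object you passed in. A rough sketch of that idea (not the exact code from the link; the helper names here are illustrative):

import subprocess
import sys
import threading

def _tee(pipe, console, capture):
    # copy the child's pipe to the console and to the capture object,
    # line by line, until the pipe closes
    for line in iter(pipe.readline, b''):
        text = line.decode()
        console.write(text)
        console.flush()
        if capture is not None:
            capture.write(text)
    pipe.close()

def teed_call(cmd, stdout=None, stderr=None):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    threads = [
        threading.Thread(target=_tee, args=(p.stdout, sys.stdout, stdout)),
        threading.Thread(target=_tee, args=(p.stderr, sys.stderr, stderr)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return p.wait()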

jfs
  • Poster wants to process data from stdout/stderr as it comes in, not at the end of the command. – Thomas Vander Stichele Sep 06 '12 at 08:57
  • @ThomasVanderStichele: click the link and look the `call()` definition. It displays stdout/stderr as soon as possible. It is not `subprocess.call()`. – jfs Sep 06 '12 at 16:11
  • This doesn't work, at least not always: `AttributeError: StringIO instance has no attribute 'fileno'` – KomodoDave Aug 21 '13 at 10:13
  • @KomodoDave: Are you sure that you use [`teed_call()` defined in the provided link](http://stackoverflow.com/a/4985080/4279) and *not* `subprocess.call()`? I've renamed `call()` to `teed_call()` to avoid the confusion (just in case). – jfs Aug 21 '13 at 11:02
  • You've just edited your post to use `teed_call` since I commented - unsurprisingly no I didn't use it before. Thank you for the fix. – KomodoDave Aug 21 '13 at 11:30
  • @KomodoDave: yes, *all* I did is renamed [`call()` function](http://stackoverflow.com/revisions/4985080/2) (note: it is not `subprocess.call()`) to [`teed_call()`](http://stackoverflow.com/revisions/4985080/3). Follow the links to see that the function definition hasn't changed between the revisions. – jfs Aug 21 '13 at 12:07
  • @J.F.Sebastian I understand now, apologies and thank you. – KomodoDave Aug 21 '13 at 14:53
  • @KomodoDave Could you remove your misleading comment? The code does work. – jfs Mar 12 '18 at 08:31
2

Create two readers as above, one for stdout and one for stderr, and start each in a new thread. They would append to their lists in roughly the same order the lines were output by the process. Maintain two separate lists if you want to keep the streams apart; a fuller runnable sketch follows the snippet below.

i.e.,

p = subprocess.Popen(["the", "command"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
t1 = thread.start_new_thread(reader, (p.stdout,))  # reader() is a read loop like the ones above
t2 = thread.start_new_thread(reader, (p.stderr,))
p.wait()
# your logic here
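
A self-contained sketch of the same idea using the higher-level threading module (the command and the reader helper are illustrative, not part of the original answer):

import subprocess
import sys
import threading

def reader(pipe, sink, collected):
    # echo each line to the parent's stream and keep a copy for later
    for line in iter(pipe.readline, b''):
        text = line.decode()
        sink.write(text)
        sink.flush()
        collected.append(text)
    pipe.close()

p = subprocess.Popen(["the", "command"],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout_lines, stderr_lines = [], []
threads = [
    threading.Thread(target=reader, args=(p.stdout, sys.stdout, stdout_lines)),
    threading.Thread(target=reader, args=(p.stderr, sys.stderr, stderr_lines)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
exit_code = p.wait()
# stdout_lines and stderr_lines now hold the captured output
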
dfb