
I have a Python script where I'm running an external archive command with subprocess.Popen(). I'm piping its stdout to sys.stdout.write and a log file (see code below), because I need to both print and log the output. The external command outputs progress like "Writing Frame 1 of 1,000", which I would like in my log.

So far I can either have it display/write in large blocks by including "stdout=subprocess.PIPE, stderr=subprocess.PIPE", but then the user thinks the script isn't working. Or, if I just have "stdout=subprocess.PIPE", the progress messages like "Writing Frame..." aren't in the log file.

Any thoughts?

My script looks something like this:

import subprocess
import sys

archive_log = open('archive.log', 'w')
archive_log.write('Archive Begin\n')
# Archive command
process_archive = subprocess.Popen(["external_command", "-v", "-d"],
                                   stdout=subprocess.PIPE, stderr=subprocess.PIPE)

for line in process_archive.stdout:
    sys.stdout.write(line)   # echo progress to the terminal
    archive_log.write(line)  # and record it in the log

archive_log.write('Archive End\n')
archive_log.close()
  • if you set *both* stdout and stderr to `PIPE` then you should read them *both* concurrently; otherwise a deadlock is possible. You could use threads or asyncio for that, see [Python subprocess get children's output to file and terminal?](http://stackoverflow.com/q/4984428/4279) – jfs May 19 '15 at 22:13

2 Answers


It sounds like you're just trying to merge the subprocess's stdout and stderr into a single pipe. To do that, as the docs explain, you just pass stderr=subprocess.STDOUT.
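
For example, a minimal sketch based on the code from the question (the universal_newlines=True argument is added here so the pipe yields text rather than bytes on Python 3):

import subprocess
import sys

archive_log = open('archive.log', 'w')
process_archive = subprocess.Popen(
    ["external_command", "-v", "-d"],
    stdout=subprocess.PIPE,    # capture stdout...
    stderr=subprocess.STDOUT,  # ...and merge stderr into the same pipe
    universal_newlines=True)

for line in process_archive.stdout:
    sys.stdout.write(line)   # echo to the terminal
    archive_log.write(line)  # and to the log

process_archive.wait()
archive_log.close()

With only one pipe to read, a plain for loop over it can't deadlock.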


If, on the other hand, you want to read from both pipes independently, without blocking on either one of them, then you need some explicit asynchronicity.

One way to do this is to just create two threads, one of them blocking on proc.stdout, the other on proc.stderr, then just have the main thread join both threads. (You probably want a lock inside the for body in each thread; that's the only way to make sure that lines are written atomically and in the same order on stdout and in the file.)
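
A rough sketch of the two-thread version (names like tee are mine, not from the question):

import subprocess
import sys
import threading

write_lock = threading.Lock()

def tee(pipe, log_file):
    # copy lines from one pipe to both the terminal and the log file
    for line in pipe:
        with write_lock:  # keep each line atomic across the two threads
            sys.stdout.write(line)
            log_file.write(line)

proc = subprocess.Popen(["external_command", "-v", "-d"],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)

with open('archive.log', 'w') as archive_log:
    threads = [threading.Thread(target=tee, args=(pipe, archive_log))
               for pipe in (proc.stdout, proc.stderr)]
    for t in threads:
        t.start()
    for t in threads:  # the main thread just joins both readers
        t.join()
    proc.wait()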

Alternatively, many reactor-type async I/O libraries, including the stdlib's own asyncio (if you're on 3.4+) and major third-party libraries like Twisted, can be used to multiplex multiple subprocess pipes.
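
Here's a sketch using asyncio's subprocess support (written with async/await and asyncio.run(), so it assumes Python 3.7+ rather than the bare 3.4 API):

import asyncio
import sys

async def tee_stream(stream, log_file):
    # read lines from one pipe as they arrive
    while True:
        line = await stream.readline()
        if not line:  # EOF
            break
        text = line.decode()
        sys.stdout.write(text)
        log_file.write(text)

async def main():
    proc = await asyncio.create_subprocess_exec(
        "external_command", "-v", "-d",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)
    with open('archive.log', 'w') as archive_log:
        # service both pipes concurrently so neither can block the other
        await asyncio.gather(
            tee_stream(proc.stdout, archive_log),
            tee_stream(proc.stderr, archive_log))
    await proc.wait()

asyncio.run(main())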

Finally, at least if you're on Unix, if you understand all the details, you may be able to do it with just select or selectors. (If this doesn't make you say, "Aha, I know how to do it, I just have a question about one of the details", ignore this idea and use one of the other two.)
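
For completeness, roughly what the selectors version looks like (Unix-only, and only a sketch; read1 does a single read of whatever is currently available, so the blocking pipes never stall the loop):

import selectors
import subprocess
import sys

proc = subprocess.Popen(["external_command", "-v", "-d"],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

sel = selectors.DefaultSelector()
sel.register(proc.stdout, selectors.EVENT_READ)
sel.register(proc.stderr, selectors.EVENT_READ)

with open('archive.log', 'w') as archive_log:
    while sel.get_map():  # until both pipes have hit EOF
        for key, _ in sel.select():
            data = key.fileobj.read1(4096)
            if not data:
                sel.unregister(key.fileobj)  # this pipe is done
                continue
            text = data.decode()
            sys.stdout.write(text)
            archive_log.write(text)
    proc.wait()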


It's clear that you really do need stderr here. From your question:

Or, if I just have "stdout=subprocess.PIPE", the progress messages like "Writing Frame..." aren't in the log file.

That means the subprocess is writing those messages to stderr, not stdout. So when you don't capture stderr, it just passes through to the terminal, rather than being captured and written to both the terminal and the log by your code.

And it's clear that you really do need them either merged or handled asynchronously:

I can either have it display/write in large blocks by including "stdout=subprocess.PIPE, stderr=subprocess.PIPE", but then the user thinks the script isn't working.

The reason the user thinks the script isn't working is that, although you haven't shown us the code that does this, clearly you're looping over all of stdout and then over all of stderr. This means the progress messages on stderr won't show up until stdout is done, so the user will think the script isn't working.

  • I'm actually just trying to get stdout. I'm not sure why adding stderr works, and I'm finding out now it only sometimes does. Hmmmm – Elliot May 18 '15 at 21:25
  • @Elliot: If the "Writing of Frame..." don't appear in the log file if you don't capture stderr, that means that they're being written to stderr. So you need to capture it. If the user thinks the program isn't working when you capture stderr and stdout separately, that means that you're reading all of stdout before any of stderr, or vice-versa, so you're blocking on output that isn't there. So you need to merge them, or capture them asynchronously. – abarnert May 18 '15 at 21:26
  • Sorry for the delay. Thanks for the help! – Elliot Jun 07 '15 at 01:47

Is there a reason you aren't using check_call and the syslog module to do this?
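
For reference, a minimal sketch of that suggestion using the Unix-only syslog module (note the caveat in the comments below: check_call can't capture and log the output at the same time):

import subprocess
import syslog

syslog.syslog('Archive Begin')
# check_call waits for the command to finish and raises
# CalledProcessError on a nonzero exit status; with no pipes,
# its output goes straight to the terminal, uncaptured
subprocess.check_call(["external_command", "-v", "-d"])
syslog.syslog('Archive End')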

You might also want to use with like this:

with open('archive.log', 'w') as archive_log:
    archive_log.write('Archive Begin\n')
    # ... run the command and write its output here ...

You gain the benefit of the file being closed automatically.

  • Using `check_call` doesn't allow you to capture and display progress messages as they appear; you only get them all at the end of the process, so, as the question says, "the user thinks the program isn't working". – abarnert May 18 '15 at 21:31
  • Maybe I'm not understanding then what the original poster is doing. When I use check_call I see in stdout the command returning as it runs. – user193673 May 19 '15 at 13:08
  • I'm guessing you're not passing a new pipe for stdout at all, so stdout is just the terminal, in which case of course you'll see the output as it happens. But then your program isn't capturing it and can't log it. You can get one or the other with the check_* functions, but if you want both prompt output to the terminal and logging to a file, you can't. – abarnert May 19 '15 at 19:19