30

I'm writing a python script that uses subprocess.Popen to execute two programs (from compiled C code) which each produce stdout. The script gets that output and saves it to a file. Because the output is sometimes large enough to overwhelm subprocess.PIPE, causing the script to hang, I send the stdout directly to the log file. I want to have my script write something to the beginning and end of the file, and between the two subprocess.Popen calls. However, when I look at my log file, anything I wrote to the log file from the script is all together at the top of the file, followed by all the executable stdout. How can I interleave my added text to the file?

def run(cmd, logfile):
    p = subprocess.Popen(cmd, shell=True, universal_newlines=True, stdout=logfile)
    return p

def runTest(path, flags, name):
    log = open(name, "w")
    print >> log, "Calling executable A"
    a_ret = run(path + "executable_a_name" + flags, log)
    print >> log, "Calling executable B"
    b_ret = run(path + "executable_b_name" + flags, log)
    print >> log, "More stuff"
    log.close()

The log file has: Calling executable A Calling executable B More stuff [... stdout from both executables ...]

Is there a way I can flush A's stdout to the log after calling Popen, for example? One more thing that might be relevant: Executable A starts then pends on B, and after B prints stuff and finishes, A then prints more stuff and finishes.

I'm using Python 2.4 on RHE Linux.

jfs
  • 399,953
  • 195
  • 994
  • 1,670
jasper77
  • 1,553
  • 5
  • 19
  • 31
  • When I used stdout=subprocess.PIPE and had the outer loop write everything to the log file, I was able to interleave my own text with the executables' output. When I add no text, the log has content in this order: 1)A output 2) B output 3) rest of A output. I could add text before or after each of those steps. Now I can only add text at the start or end of the log. Adding a wait() after the Popen hangs the script because B wouldn't start until A finishes, which it won't, because A waits for handshaking from B. Is it possible to interleave my own text in the log with this approach? – jasper77 Jul 07 '10 at 17:25

4 Answers4

28

You could call .wait() on each Popen object in order to be sure that it's finished and then call log.flush(). Maybe something like this:

def run(cmd, logfile):
    p = subprocess.Popen(cmd, shell=True, universal_newlines=True, stdout=logfile)
    ret_code = p.wait()
    logfile.flush()
    return ret_code

If you need to interact with the Popen object in your outer function you could move the .wait() call to there instead.

Benno
  • 5,640
  • 2
  • 26
  • 31
  • 1
    If I put a wait() in the run() function, then executable B won't start running until A finishes, and since A doesn't finish until B does, the script would hang. However, I found that if I have runTest(), the outer function, run A then B, then wait on A and flush the log, a line I print at the end of runTest actually shows up at the end of the log file. I still have not found a way to print text to the file just before B is run. I don't know that there is a way. – jasper77 Jul 07 '10 at 14:40
  • 2
    `logfile.flush()` has no effect on child processes. – jfs Nov 22 '12 at 13:59
  • 1
    `shell=True` is not recommended – alper Apr 30 '20 at 11:47
3

You need to wait until the process is finished before you continue. I've also converted the code to use a context manager, which is cleaner.

def run(cmd, logfile):
    p = subprocess.Popen(cmd, shell=True, universal_newlines=True, stdout=logfile)
    p.wait()
    return p

def runTest(path, flags, name):
    with open(name, "w") as log:
        print >> log, "Calling executable A"
        a_ret = run(path + "executable_a_name" + flags, log)
        print >> log, "Calling executable B"
        b_ret = run(path + "executable_b_name" + flags, log)
        print >> log, "More stuff"
Chris B.
  • 85,731
  • 25
  • 98
  • 139
  • Context managers are a python2.6 feature that isn't available for anyone still running RHEL5 systems. Until RHEL6 comes out, it's best to not use them. – Jerub Jul 07 '10 at 05:25
  • You can use context managers in Python 2.5 by using `from __future__ import with_statement` before any other imports. – detly Jul 08 '10 at 04:32
  • But it looks like RHEL5 is stuck on Python 2.4. – Chris B. Jul 08 '10 at 14:46
2

As I understand it A program waits for B to do its thing and A exits only after B exits.

If B can start without A running then you could start the processes in the reverse order:

from os.path import join as pjoin
from subprocess import Popen

def run_async(cmd, logfile):
    print >>log, "calling", cmd
    p = Popen(cmd, stdout=logfile)
    print >>log, "started", cmd
    return p

def runTest(path, flags, name):
    log = open(name, "w", 1)  # line-buffered
    print >>log, 'calling both processes'
    pb = run_async([pjoin(path, "executable_b_name")] + flags.split(), log)
    pa = run_async([pjoin(path, "executable_a_name")] + flags.split(), log)
    print >>log, 'started both processes'
    pb.wait()
    print >>log, 'process B ended'
    pa.wait()
    print >>log, 'process A ended'
    log.close()

Note: calling log.flush() in the main processes has no effect on the file buffers in the child processes.

If child processes use block-buffering for stdout then you could try to force them to flush sooner using pexpect, pty, or stdbuf (it assumes that the processes use line-buffering if run interactively or they use C stdio library for I/O).

Community
  • 1
  • 1
jfs
  • 399,953
  • 195
  • 994
  • 1,670
2

I say just keep it real simple. Pseudo code basic logic:

write your start messages to logA
execute A with output to logA
write your in-between messages to logB
execute B with output to logB
write your final messages to logB
when A & B finish, write content of logB to the end of logA
delete logB
Peter Lyons
  • 142,938
  • 30
  • 279
  • 274
  • Thank you for the thinking-outside-the-box suggestion to use two separate log files for A and B instead of a single log file. I'll have to think about that. – jasper77 Jul 08 '10 at 16:39