
I have a Python program that at some point creates a subprocess and then needs to read its standard output in "real time" through a file (the process takes a while, and some of the output is needed while it is still running). The problem is that a chunk of the output is sometimes lost (specifically, a chunk at the beginning). The code looks like this:

import subprocess
import tempfile
import time
..
tf_out = tempfile.TemporaryFile()
tf_err = tempfile.TemporaryFile()
tf_in = tempfile.TemporaryFile()

tf_in.write(some_stdin)
tf_in.seek(0)

# create the process
process = subprocess.Popen(command, shell=True,
            stdin=tf_in,
            stdout=tf_out,
            stderr=tf_err)

running = True
while running:
    # last iteration if process ended
    if process.poll() is not None:
        running = False

    # read output
    tf_out.seek(0)          # 1
    stdout = tf_out.read()  # 2

    # do something with the partial output
    do_something(stdout)

    time.sleep(0.1)

tf_out.close()
tf_err.close()
tf_in.close()
..

I wonder whether the problem lies between # 1 and # 2, in the following scenario: the subprocess writes something, the parent seeks to zero at # 1, the subprocess's next write then lands at offset zero because of that seek (overwriting the earlier data), and finally the read at # 2 no longer sees the first write. If that is the problem, any idea how to avoid it?
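
A small experiment that could test this hypothesis is sketched below. It assumes Python 3 on a POSIX system, and the helper names `child_write` and `shared_offset_demo` are made up for illustration. Two short child processes write to the same temporary file, and the parent seeks to zero in between. If the file offset is shared between parent and child, the second write should overwrite the start of the first.

```python
import subprocess
import sys
import tempfile


def child_write(text, out_file):
    """Run a short child process that writes `text` to its stdout."""
    code = "import sys; sys.stdout.write(%r)" % text
    subprocess.check_call([sys.executable, "-c", code], stdout=out_file)


def shared_offset_demo():
    # buffering=0 gives a raw file object, so seek() goes straight to the OS.
    with tempfile.TemporaryFile(buffering=0) as tf:
        child_write("AAAA", tf)  # file holds b"AAAA", shared offset is now 4
        tf.seek(0)               # the parent rewinds the shared offset
        child_write("BB", tf)    # this child writes at offset 0, not at 4
        tf.seek(0)
        return tf.read()


if __name__ == "__main__":
    # Expected on POSIX: b'BBAA' (the second write clobbers the first bytes)
    print(shared_offset_demo())
```

If the result is `b'BBAA'` rather than `b'AAAABB'`, the seek in the parent really does move the position at which the child writes.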

Thanks in advance!

nDeckard
  • Is there a reason you can't use [`subprocess.PIPE`](https://docs.python.org/2/library/subprocess.html#subprocess.PIPE), which would allow you to read/write from `process.stdout`/`process.stdin` directly? – dano Jul 07 '14 at 18:21
  • Dano, as far as I know those objects are written when the process ends so I cannot use them to get the output in "real time". – nDeckard Jul 08 '14 at 17:22

2 Answers


Getting realtime data from Popen is quite finicky. In addition, good old print also generally buffers output.

The following is adapted from another SO question. It's set up so you can just run it.

source

#!/usr/bin/env python

# adapted from http://stackoverflow.com/questions/2804543/read-subprocess-stdout-line-by-line

import subprocess, sys, time

def test_proc(cmd):
    start = time.time()

    print 'COMMAND:',cmd

    proc = subprocess.Popen(
            cmd,
            shell=True,
            stdout=subprocess.PIPE,
    )
    for line in iter(proc.stdout.readline, ''):
        sys.stdout.write('{:.2f}  {}\n'.format(
            time.time() - start,
            line.rstrip()
        ))
        sys.stdout.flush()

if __name__=='__main__':
    test_proc('ping -c3 8.8.8.8')

output

COMMAND: ping -c3 8.8.8.8
0.05  PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
0.05  64 bytes from 8.8.8.8: icmp_seq=1 ttl=43 time=49.0 ms
1.05  64 bytes from 8.8.8.8: icmp_seq=2 ttl=43 time=46.8 ms
2.05  64 bytes from 8.8.8.8: icmp_seq=3 ttl=43 time=49.1 ms
2.05  
2.05  --- 8.8.8.8 ping statistics ---
2.05  3 packets transmitted, 3 received, 0% packet loss, time 2003ms
2.05  rtt min/avg/max/mdev = 46.826/48.334/49.143/1.097 ms

The numbers on the left are seconds since the start of the command. Since the ping command sends one packet per second, this verifies that the data is processed in real time. The two "0.05" lines (the header and the first response) are printed right after ping starts. The next line appears a second later, when the second response is received. At 2.05 the last response is received, and ping prints its footer.

Very convenient!
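
The script above uses Python 2 syntax. A rough Python 3 variant of the same idea is sketched below. The generator name `stream_lines` and the small child script are hypothetical, and the child is started with `-u` so that its own output buffering does not hide the effect.

```python
import subprocess
import sys
import time


def stream_lines(cmd):
    """Yield (elapsed_seconds, line) pairs as the child produces output."""
    start = time.time()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
        bufsize=1,                # line-buffered pipe on the parent side
    )
    with proc.stdout:
        for line in proc.stdout:
            yield time.time() - start, line.rstrip("\n")
    proc.wait()


if __name__ == "__main__":
    # Hypothetical child: prints three lines, pausing between them.
    script = "import time\nfor i in range(3):\n    print(i)\n    time.sleep(0.2)"
    for elapsed, line in stream_lines([sys.executable, "-u", "-c", script]):
        print("{:.2f}  {}".format(elapsed, line))
```

Whether lines arrive promptly still depends on the child: a program that buffers its stdout when writing to a pipe will deliver its output in bursts no matter how the parent reads it.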

johntellsall

I just figured out how to fix this. An easy solution is to open the temporary file twice: one handle is passed to the subprocess and the other is used for the reads. You can use the tempfile module to create a named temporary file and reopen it by name, like this:

import subprocess
import tempfile

tf_in = tempfile.TemporaryFile()
tf_in.write(some_stdin)
tf_in.seek(0)

tf_out = tempfile.NamedTemporaryFile()
tf_out2 = open(tf_out.name, 'r')
tf_err = tempfile.NamedTemporaryFile()
tf_err2 = open(tf_err.name, 'r')

# create the process
process = subprocess.Popen(command, shell=True,
            stdin=tf_in,
            stdout=tf_out,
            stderr=tf_err)

# same loop but reading from tf_out2 and tf_err2

# close files

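Using two file objects for the writes and the reads solves the problem: each one has its own file offset, so seeking in the reader no longer moves the position at which the subprocess writes.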
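
For completeness, here is a minimal runnable sketch of that loop, assuming Python 3 on a POSIX system (reopening a `NamedTemporaryFile` by name while it is still open may not work on Windows). The function name `run_and_tail` and its `poll_interval` parameter are made up for this example. It reads incrementally from the second handle instead of seeking back to zero on every pass.

```python
import subprocess
import sys
import tempfile
import time


def run_and_tail(cmd, poll_interval=0.1):
    """Run cmd and return the list of output chunks read while it ran."""
    chunks = []
    with tempfile.NamedTemporaryFile() as writer, \
            open(writer.name, "rb", buffering=0) as reader:
        proc = subprocess.Popen(cmd, stdout=writer)
        while True:
            # Check for exit first, so the last read happens after the
            # process has finished and nothing written late is missed.
            finished = proc.poll() is not None
            chunk = reader.read()  # continues from the reader's own offset
            if chunk:
                chunks.append(chunk)
            if finished:
                break
            time.sleep(poll_interval)
    return chunks


if __name__ == "__main__":
    child = [sys.executable, "-c", "print('hello'); print('world')"]
    print(b"".join(run_and_tail(child)))
```

Because the reader keeps its own position, each pass returns only the new data. To get the whole output so far on every pass, as in the original loop, call `reader.seek(0)` before `reader.read()`.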

Thanks for the answers anyway!

nDeckard