Possible Duplicate:
subprocess.Popen.stdout - reading stdout in real-time, again!

I am processing the binary output of a file, but I am using a temporary string to hold that output. Since the output could in theory be fairly large, I would prefer to process it as a stream using unpack or unpack_from.

The code is something like this:

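import csv
import struct
import subprocess

file = '/home/t/FinancialData/GBPUSD/2007/05/01/20070501_01h_ticks.bi5'
command = ('lzma', '-kdc', '-S', 'bi5', file)
p = subprocess.Popen(command, stdout=subprocess.PIPE)
out, err = p.communicate()  # reads the entire output into memory
# Open the CSV file once; opening it inside the loop would truncate it for every record.
# csvfilename is defined elsewhere. Python 3 mode shown; on Python 2 use open(csvfilename, 'wb').
with open(csvfilename, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile, delimiter=',',
                           quotechar='|', quoting=csv.QUOTE_MINIMAL)
    # Each record is 20 bytes: three unsigned 32-bit ints and two floats, big-endian.
    for s in (out[x:x+20] for x in range(0, len(out), 20)):
        values = struct.unpack(">3L2f", s)
        csvwriter.writerow(values)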

Is there a way to rewrite this so that it does not have to store the whole output in out, but processes it as a stream?

BlueTrin
  • Have you seen other StackOverflow questions similar to the one you have asked already? Especially, I think http://stackoverflow.com/questions/3140189/subprocess-popen-stdout-reading-stdout-in-real-time-again might help. – Siddharth Toshniwal Oct 31 '12 at 10:45
  • Sorry, I'll vote to close as a dupe – BlueTrin Oct 31 '12 at 10:58

2 Answers

You can read from the file object p.stdout:

while True:
    s = p.stdout.read(20)
    if not s:
        break
    values = struct.unpack(">3L2f", s)
    ...
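
As a fuller sketch of the same loop, the record reading can be wrapped in a generator and the rows written to a CSV file that is opened once. The names `RECORD`, `iter_records` and `convert` are hypothetical helpers introduced here for illustration, Python 3 is assumed, and raising on a truncated final record is a design choice of this sketch rather than something the original code does.

```python
import csv
import struct
import subprocess

RECORD = struct.Struct(">3L2f")  # 20 bytes per record, same format as in the question


def iter_records(stream, record=RECORD):
    """Yield one unpacked tuple per fixed-size record read from a buffered binary stream."""
    while True:
        chunk = stream.read(record.size)
        if not chunk:
            break  # clean end of stream
        if len(chunk) < record.size:
            # Assumption of this sketch: a short final record is an error.
            raise ValueError("truncated record: got %d bytes" % len(chunk))
        yield record.unpack(chunk)


def convert(command, csvfilename):
    """Stream the output of `command` into a CSV file (hypothetical helper)."""
    with subprocess.Popen(command, stdout=subprocess.PIPE) as p, \
            open(csvfilename, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        for values in iter_records(p.stdout):
            writer.writerow(values)
    return p.returncode
```

Only one 20-byte record is held in memory at a time, and `iter_records` works on any buffered binary file object, so it can be exercised without running lzma at all.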

Note that this approach is only safe if you have at most one pipe on the Popen object; any more and the process could block waiting for input or writing to stderr. In that case you should use poll, select or threading to multiplex the pipes.
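
For the multi-pipe case, one possible sketch of the threading option is to let a helper thread drain stderr while the main thread reads stdout. The function name `run_with_stderr_drained` and its `handle_chunk` callback are assumptions made for this example, not part of the original answer.

```python
import subprocess
import threading


def run_with_stderr_drained(command, handle_chunk, chunk_size=20):
    """Read stdout in fixed-size chunks while a helper thread collects stderr."""
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    err_chunks = []

    def drain():
        # Reading stderr to EOF keeps the child from blocking on a full stderr pipe.
        for line in p.stderr:
            err_chunks.append(line)

    t = threading.Thread(target=drain)
    t.start()
    while True:
        s = p.stdout.read(chunk_size)
        if not s:
            break  # stdout reached end of file
        handle_chunk(s)
    t.join()
    p.stdout.close()
    p.stderr.close()
    returncode = p.wait()
    return returncode, b"".join(err_chunks)
```

Because each pipe has its own reader, neither one can fill up and stall the child while the other is being read.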

ecatmur

You can put a select call around the stdout attribute of the Popen object and poll until the process completes. For example:

from subprocess import Popen, PIPE
from select import select

cmd = ('lzma', '-kdc', '-S', 'bi5', 'path/to/datafile')
p = Popen(cmd, stdout=PIPE)

while p.poll() is None:
    r, w, e = select([p.stdout], [], [])
    if r:
        data = p.stdout.read(512)
        # unpack and append to csv file ...
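
A variant sketch of the select loop, under a few assumptions: it reads until end of file rather than until the process exits, so data still sitting in the pipe after exit is not dropped, and it buffers leftover bytes because a chunk from the pipe need not be a multiple of the 20-byte record size. The names `RECORD`, `split_records` and `stream_records` are hypothetical. As with the loop above, select on a pipe works on Unix-like systems only.

```python
import os
import struct
from select import select
from subprocess import Popen, PIPE

RECORD = struct.Struct(">3L2f")  # 20-byte records, same format as in the question


def split_records(buf, record=RECORD):
    """Return (list of unpacked records, leftover bytes) for a byte buffer."""
    usable = len(buf) - len(buf) % record.size
    records = list(record.iter_unpack(buf[:usable]))
    return records, buf[usable:]


def stream_records(cmd, chunk_size=4096):
    """Yield records from the command's stdout as data becomes readable."""
    p = Popen(cmd, stdout=PIPE)
    fd = p.stdout.fileno()
    pending = b""
    while True:
        readable, _, _ = select([fd], [], [])
        if not readable:
            continue
        data = os.read(fd, chunk_size)  # returns what is available, b"" at EOF
        if not data:
            break
        records, pending = split_records(pending + data)
        for rec in records:
            yield rec
    p.stdout.close()
    p.wait()
    if pending:
        # Assumption of this sketch: trailing bytes that do not form a record are an error.
        raise ValueError("truncated record: %d trailing bytes" % len(pending))
```

`os.read` on the raw file descriptor is used instead of `p.stdout.read(512)` because the buffered `read` may block until the full count has arrived, which defeats the purpose of waiting in select.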

Cheers,

gvalkov