
I would like to count the number of lines written to stdout by a process (here unrar.exe) created with Popen.

import time
from subprocess import Popen, PIPE, STDOUT

p = Popen('unrar.exe x -y myfile.rar', stdout=PIPE)

while (p is not finished):      # pseudo code here and next lines...
    time.sleep(0.100)
    print 'Number of lines written to STDOUT by unrar' + len(PIPE.split('\n'))

How can I do this properly?

Remark: I already looked at p.communicate() (https://python.readthedocs.org/en/v2.7.2/library/subprocess.html#subprocess.Popen.communicate), but it blocks the Python script until p has terminated, which is not what I want: I want to print the number of lines written by p while it is running.

Basj
  • possible duplicate of [Non-blocking read on a subprocess.PIPE in python](http://stackoverflow.com/questions/375427/non-blocking-read-on-a-subprocess-pipe-in-python) – Martijn Pieters Feb 24 '14 at 13:41
  • @MartijnPieters I'm looking for an easier solution without using another thread, etc. I don't mind if the process is blocked every 100 ms – Basj Feb 24 '14 at 13:49
  • @Basj: you can't do it without blocking unless you use threads or `select` (`poll`, `kqueue`, named pipe on Windows) or `fcntl` or similar. – jfs Feb 24 '14 at 15:08
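
For reference, the `select` route mentioned in the last comment could look roughly like the sketch below on POSIX systems. It is not code from this thread: `count_lines_with_select`, `interval` and `report` are hypothetical names chosen for illustration, and the child command in the demo is a stand-in. `select` only accepts sockets on Windows, so this would not work as-is with `unrar.exe` there.

```python
from __future__ import print_function
import os
import select
import sys
from subprocess import Popen, PIPE


def count_lines_with_select(cmd, interval=0.1, report=None):
    """Count newlines on cmd's stdout, waking up at least every `interval` s.

    Hypothetical helper; POSIX only, because select() needs a pipe fd here.
    """
    p = Popen(cmd, stdout=PIPE, bufsize=0)
    fd = p.stdout.fileno()
    count = 0
    while True:
        # Wait up to `interval` seconds for data; an empty list means timeout.
        readable, _, _ = select.select([fd], [], [], interval)
        if readable:
            chunk = os.read(fd, 4096)  # returns whatever is available now
            if not chunk:  # b'' means the child closed its stdout (EOF)
                break
            count += chunk.count(b'\n')
        if report is not None:
            report(count)  # called after every wake-up, data or timeout
    p.stdout.close()
    p.wait()
    return count


if __name__ == '__main__':
    # Stand-in child process that prints five lines.
    demo_cmd = [sys.executable, '-c', 'for i in range(5): print(i)']
    print(count_lines_with_select(demo_cmd, report=print))
```

The parent never blocks for longer than `interval`, so other work can be done inside the loop between wake-ups.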

2 Answers


> I'm looking for an easier solution without using another thread, etc. I don't mind if the process is blocked every 100 ms

If it is a hard requirement that the process must not block, then you need threads (or other asynchronous techniques). To emulate a non-blocking `wc --lines <(cmd)`:

#!/usr/bin/env python
import io
import shlex
from functools import partial
from subprocess import Popen, PIPE
from threading import Thread
from Queue import Queue

def count_lines(pipe, queue, chunksize=io.DEFAULT_BUFFER_SIZE):
    #NOTE: you could write intermediate results here (just drop `sum()`)
    queue.put(sum(chunk.count(b'\n')
                  for chunk in iter(partial(pipe.read, chunksize), b'')))
    pipe.close()

p = Popen(shlex.split('unrar.exe x -y myfile.rar'), stdout=PIPE, bufsize=-1)
result = Queue()
Thread(target=count_lines, args=[p.stdout, result]).start()
p.wait() # you can omit it if you want to do something else and call it later
number_of_lines = result.get() # this blocks (you could pass `timeout`)
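
The `#NOTE` about intermediate results could be sketched as follows: a background thread keeps a running count, and the main thread polls it every 100 ms, as in the question. This is only an illustration, not the code above. `LineCounter`, `run_and_count`, `interval` and `report` are hypothetical names, the demo command is a stand-in for the `unrar.exe` call, and the thread reads line by line rather than in chunks so that the counter updates as soon as a line arrives.

```python
from __future__ import print_function
import sys
import time
from subprocess import Popen, PIPE
from threading import Thread


class LineCounter(object):
    """Counts lines read from a pipe on a background thread (hypothetical)."""

    def __init__(self, pipe):
        self.count = 0  # written by the reader thread, read by the caller
        self._pipe = pipe
        self._thread = Thread(target=self._run)
        self._thread.daemon = True
        self._thread.start()

    def _run(self):
        # readline() blocks only this thread; b'' signals EOF.
        for _line in iter(self._pipe.readline, b''):
            self.count += 1
        self._pipe.close()

    def join(self):
        self._thread.join()


def run_and_count(cmd, interval=0.1, report=None):
    """Run cmd, call report(count) every `interval` s, return the final count."""
    p = Popen(cmd, stdout=PIPE)
    counter = LineCounter(p.stdout)
    while p.poll() is None:  # the child is still running
        time.sleep(interval)
        if report is not None:
            report(counter.count)
    counter.join()  # let the thread drain any output left after exit
    return counter.count


if __name__ == '__main__':
    # Stand-in child process that prints five lines.
    demo_cmd = [sys.executable, '-c', 'for i in range(5): print(i)']
    print('total:', run_and_count(demo_cmd, report=print))
```

Because the thread keeps draining the pipe, the child cannot stall on a full pipe while the main thread sleeps.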

On the other hand, if all you need is "to print the number of lines written by p when it's running", then you could count lines in the main thread:

#!/usr/bin/env python
import shlex
from subprocess import Popen, PIPE

p = Popen(shlex.split('unrar.exe x -y myfile.rar'), stdout=PIPE, bufsize=1)
count = 0
for line in iter(p.stdout.readline, b''):
    count += 1
    print count
p.stdout.close()
p.wait()
jfs

I don't know if this is the cleanest way to do it, but it works:

from subprocess import Popen, PIPE

p = Popen('unrar.exe x -y myfile.rar', stdout=PIPE)

i = 0
# readline() returns '' at EOF; iter() stops there, so EOF is not counted as a line
for line in iter(p.stdout.readline, b''):
    i += 1
    print i

p.stdout.close()
p.wait()
print 'FINISHED'
Basj
  • @J.F.Sebastian: in which situation may it block at `readline()`? If there are no more lines, wouldn't it return `''`, and wouldn't that end the loop? – Basj Feb 25 '14 at 00:29
  • `p.stdout.readline()` blocks until a line is available or EOF is reached. E.g., in the typical case of block-buffered output to a pipe, you won't see anything until the buffer fills up or is flushed (print the current time to see that the parent sees the output in bursts). If the child process doesn't produce much output, it may take a while before you see the first line. – jfs Feb 25 '14 at 01:59
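
The bursts described in the last comment can be observed with a small experiment along these lines. It is a sketch under assumptions: the child is a stand-in Python script rather than `unrar.exe`, `CHILD` and `arrival_times` are hypothetical names, and the child is assumed to use default block buffering when its stdout is a pipe (which is why `PYTHONUNBUFFERED` is removed from its environment).

```python
from __future__ import print_function
import os
import sys
import time
from subprocess import Popen, PIPE

# Stand-in child: prints three lines, 0.2 s apart, optionally flushing each one.
CHILD = (
    "import sys, time\n"
    "for i in range(3):\n"
    "    print(i)\n"
    "    if {flush}: sys.stdout.flush()\n"
    "    time.sleep(0.2)\n"
)


def arrival_times(flush):
    """Return the time (in seconds since launch) at which each line arrived."""
    env = dict(os.environ)
    env.pop('PYTHONUNBUFFERED', None)  # keep the child's default buffering
    p = Popen([sys.executable, '-c', CHILD.format(flush=flush)],
              stdout=PIPE, env=env)
    start = time.time()
    times = []
    for _line in iter(p.stdout.readline, b''):  # blocks until a line or EOF
        times.append(time.time() - start)
    p.stdout.close()
    p.wait()
    return times


if __name__ == '__main__':
    print('flushed:  ', ['%.2f' % t for t in arrival_times(True)])
    print('unflushed:', ['%.2f' % t for t in arrival_times(False)])
```

With flushing, the lines should arrive spread out over time; without it, they would typically all arrive together when the child exits and its buffer is written out.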