
I need to execute a command that produces a lot of output and takes a long time to run (> 30 minutes). I was thinking of using subprocess.Popen to do it. I need to capture the command's output, so I pass PIPE for stdout and stderr.

The deadlock problem when using Popen.wait() is well documented on many forums, and Popen.communicate() is the proposed way of avoiding it. The problem with that solution is that communicate() blocks until the command completes. I need to print everything that arrives on stdout while the command is running: if there is no output for 20 minutes, the script's execution will be killed.

Here are some constraints that I need to respect:

  • My Python version is 2.4.2 and I can't upgrade.
  • If the solution is still to use subprocess, I need to pass subprocess.PIPE to all std handles to avoid this bug: http://bugs.python.org/issue1124861

Is there a way to do it?

GDICommander
  • (Coming from Google?) All PIPEs will deadlock when one of the pipes' buffers fills up and is not read, e.g. stdout deadlocks when stderr fills. Never pass a PIPE you don't intend to read. – Nasser Al-Wohaibi May 07 '14 at 11:07

5 Answers


import os
from subprocess import PIPE, STDOUT, Popen

lines = []
# cmd is your command to run (string or argument list); stdin is
# redirected from os.devnull so all three std handles are accounted for.
p = Popen(cmd, bufsize=1, stdin=open(os.devnull), stdout=PIPE, stderr=STDOUT)
for line in iter(p.stdout.readline, ''):
    print line,          # print to stdout immediately
    lines.append(line)   # capture for later
p.stdout.close()
p.wait()
jfs
  • This was what I needed. Thanks a lot! You just solved a problem that took me a complete working day to investigate! – GDICommander Apr 08 '11 at 12:43
  • @GDICommander: beware, the code might drop stderr under [Wine](http://www.winehq.org/). It works fine on Ubuntu. Make sure to test it on Windows. – jfs Apr 09 '11 at 09:49
  • What if I want to send input? For example: p.stdin.write("YES") – bora.oren Jul 24 '14 at 14:47
  • @baybora.oren: just set stdin=PIPE. Though, in general, if you want to send input *and* receive output concurrently, you have to be very careful to avoid a deadlock caused by any of the OS pipe buffers filling up: make sure the other side reads while you write, and vice versa; threads and asyncio solve this problem in general. In addition, pexpect solves the block-buffering issue and handles input/output outside stdin/stdout. – jfs Jul 24 '14 at 15:07
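Following up on that comment, here is a minimal sketch of the write-from-a-thread pattern (in modern Python syntax; the question itself targets 2.4). The stdin is fed from a separate thread while the main thread drains stdout, so neither side can deadlock on a full pipe buffer. The `cat` command is just an assumed stand-in for a real child process:

```python
import subprocess
import threading

# "cat" is a stand-in child that echoes stdin back on stdout.
p = subprocess.Popen(
    ["cat"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

def feed_input(proc):
    # Runs in its own thread so the main thread is free to read;
    # closing stdin signals EOF so the child can exit.
    proc.stdin.write(b"YES\n")
    proc.stdin.close()

writer = threading.Thread(target=feed_input, args=(p,))
writer.start()

output = p.stdout.read()   # main thread drains stdout concurrently
writer.join()
p.wait()
```

With a short input like this the thread is overkill, but the structure scales to large volumes of data where a plain write would block.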

Have you tried pexpect?

Zaur Nasibov

It sounds like you need to do a non-blocking read on the filehandles attached to the pipes.

This question addresses some ways to do that on Windows and Linux: Non-blocking read on a subprocess.PIPE in Python
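The approach from that linked question can be sketched as follows (shown with Python 3's `queue` module; on 2.x the module is called `Queue`, and the sample command is an assumption for illustration). A helper thread does the blocking reads and pushes lines onto a queue, which the main thread polls with a timeout:

```python
import queue      # named Queue on Python 2.x
import subprocess
import sys
import threading

def enqueue_output(stream, q):
    # The blocking readline happens in this thread, so the main
    # thread never blocks on the pipe directly.
    for line in iter(stream.readline, b""):
        q.put(line)
    stream.close()

p = subprocess.Popen(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
q = queue.Queue()
t = threading.Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True
t.start()

# Non-blocking from the main thread's point of view: poll the
# queue with a timeout instead of blocking on the pipe.
try:
    line = q.get(timeout=5)
except queue.Empty:
    line = None
p.wait()
```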

Eli Collins

To avoid the pipe buffers filling up, just launch a background thread in the parent process. That thread can either just continuously read from stdout (and stderr) to keep the pipe buffers from filling up, or you can invoke communicate() from it. Either way, the main thread is free to continue with ordinary processing and the child process won't block on an output operation.

Converting a synchronous IO operation into an asynchronous one (from the point of view of the main thread) is one of the best use cases for threads. Even async frameworks like Twisted will sometimes use it as a last resort solution when no native asynchronous interface is available for a given operation.
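A minimal sketch of this idea (modern Python syntax; the sample command is an assumption): a daemon thread continuously drains stdout, with stderr merged in, so the child can never block on a full pipe buffer while the main thread stays free:

```python
import subprocess
import sys
import threading

p = subprocess.Popen(
    [sys.executable, "-c", "print('line1'); print('line2')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

captured = []

def drain(stream):
    # Continuously reading keeps the pipe buffer from filling up.
    for line in iter(stream.readline, b""):
        captured.append(line)   # or print it immediately
    stream.close()

reader = threading.Thread(target=drain, args=(p.stdout,))
reader.daemon = True
reader.start()

# ... the main thread is free to do ordinary processing here ...

p.wait()
reader.join()   # the reader exits once the child closes its end
```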

ncoghlan

You might consider using multiple threads. Assign one thread to read from stdout, one from stderr, and use a third thread to detect the timeout:

import sys
import time

# last_output_time must be refreshed by the reader threads each time
# a line arrives on stdout or stderr.
while time.time() - last_output_time < 20 * 60:
    time.sleep(20 * 60 - (time.time() - last_output_time))
print 'No output detected in the last 20 minutes. Terminating execution'
sys.exit(1)
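Putting the pieces together, here is a runnable sketch of the reader-plus-watchdog arrangement (modern Python syntax; stderr merged into stdout for brevity, and an assumed 2-second timeout instead of 20 minutes so the example finishes quickly). Note that Popen.kill() only exists from Python 2.6 on; under 2.4 you would use os.kill(p.pid, signal.SIGKILL) instead:

```python
import subprocess
import sys
import threading
import time

TIMEOUT = 2  # seconds; the answer above uses 20 * 60

# Sample child (an assumption): prints one line, then goes silent.
p = subprocess.Popen(
    [sys.executable, "-c",
     "import sys, time; print('tick'); sys.stdout.flush(); time.sleep(10)"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

last_output_time = time.time()

def reader(stream):
    global last_output_time
    for line in iter(stream.readline, b""):
        last_output_time = time.time()   # refresh on every line
    stream.close()

t = threading.Thread(target=reader, args=(p.stdout,))
t.daemon = True
t.start()

# The watchdog loop from the answer, run in the main thread here;
# max() guards against a tiny negative sleep if the timestamp is
# refreshed between the check and the sleep.
while time.time() - last_output_time < TIMEOUT:
    time.sleep(max(0.0, TIMEOUT - (time.time() - last_output_time)))

if p.poll() is None:
    print('No output detected in the last %d seconds. Terminating.' % TIMEOUT)
    p.kill()   # Python 2.6+; os.kill(p.pid, signal.SIGKILL) on 2.4
p.wait()
```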
Rakis