
I want to make a Python wrapper for another command-line program.

I want to read Python's stdin as quickly as possible, filter and translate it, and then write it promptly to the child program's stdin.

At the same time, I want to be reading as quickly as possible from the child program's stdout and, after a bit of massaging, writing it promptly to Python's stdout.

The Python subprocess module is full of warnings to use communicate() to avoid deadlocks. However, communicate() doesn't give me access to the child program's stdout until the child has terminated.

Will
  • please post a snippet of how you are attempting, thnx. – user1269942 Oct 16 '14 at 20:48
  • @user1269942 I don't know which API to use. Reading through `subprocess`, none of them fit. – Will Oct 16 '14 at 20:49
  • related to the buffering issue: [Python C program subprocess hangs at “for line in iter”](http://stackoverflow.com/q/20503671/4279) – jfs Oct 18 '14 at 03:31

2 Answers


Disclaimer: This solution may require access to the source code of the program you are calling, but may be worth trying anyway. It depends on the called process flushing its stdout buffer periodically, which many programs do not do when their stdout is a pipe.

Say you have a process proc created by subprocess.Popen with stdin=subprocess.PIPE and stdout=subprocess.PIPE. proc then has attributes stdin and stdout, which are file-like objects. To send data to the child, call proc.stdin.write(); to retrieve its output, call proc.stdout.readline() to read one line at a time.

A couple of caveats:

  • When writing to proc.stdin via write(), end the input with a newline character and then call proc.stdin.flush(). A child that reads its input line by line will block until the newline arrives.
  • To read from proc.stdout, make sure the command called by subprocess flushes its stdout buffer after each print statement and ends each line with a newline. If the stdout buffer is not flushed at the appropriate times, your call to proc.stdout.readline() will hang.
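
The two caveats above can be sketched as follows. This is a minimal illustration, not a general solution: the child here is a hypothetical inline Python script (`CHILD_SOURCE`) that upper-cases each line and flushes after every print, and the helper name `ask` is made up for this example. A real child program that does not flush would make `readline()` block.

```python
import subprocess
import sys

# Hypothetical stand-in for the wrapped program: it upper-cases each input
# line and flushes stdout after every print, as the caveats require.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(line.strip().upper(), flush=True)\n"
)


def ask(proc, text):
    """Send one line to the child and read back exactly one line."""
    proc.stdin.write(text + "\n")  # newline-terminate the request
    proc.stdin.flush()             # push it through our own buffer
    return proc.stdout.readline().rstrip("\n")


proc = subprocess.Popen(
    [sys.executable, "-c", CHILD_SOURCE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
replies = [ask(proc, "hello"), ask(proc, "world")]
proc.stdin.close()  # EOF lets the child's read loop finish
proc.wait()
print(replies)
```

This strict one-line-in, one-line-out exchange only works when the child answers every request with exactly one line; otherwise reading and writing need to be decoupled.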
Vorticity
  • I found I could trick most child programs by giving them a `pty.openpty()` stdout. This tricks them into doing line-buffering rather than big block buffering. – Will Oct 16 '14 at 21:36

I think you'll be fine (carefully) ignoring the warnings and using Popen.stdin, etc. yourself. Just be sure to process the streams line by line and service them on a fair schedule so as not to fill up any buffers. A relatively simple (and inefficient) way of doing this in Python is to use separate threads for the three streams. That's how Popen.communicate does it internally on Windows (on other platforms it uses a select-style loop). Check out its source code to see how.
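
A minimal sketch of the threaded approach, under some assumptions: the names `pump_input` and `wrap` are invented for this example, the child is a hypothetical inline script that flushes after each line, and stderr is simply inherited rather than given its own thread. One thread feeds the child's stdin while the main thread drains its stdout, so neither pipe can fill up and stall the other.

```python
import io
import subprocess
import sys
import threading

# Hypothetical stand-in for the wrapped program (upper-cases and flushes).
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(line.strip().upper(), flush=True)\n"
)


def pump_input(source, child_stdin, translate):
    """Copy translated lines from source to the child, flushing each one."""
    try:
        for line in source:
            child_stdin.write(translate(line))
            child_stdin.flush()
    except OSError:
        pass  # the child went away before reading everything
    finally:
        try:
            child_stdin.close()  # EOF tells the child no more input is coming
        except OSError:
            pass


def wrap(cmd, source, sink, translate=str, massage=str):
    """Run cmd, feeding it from source and writing its output to sink."""
    proc = subprocess.Popen(
        cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    feeder = threading.Thread(
        target=pump_input, args=(source, proc.stdin, translate), daemon=True
    )
    feeder.start()
    for line in proc.stdout:  # main thread drains the child's stdout
        sink.write(massage(line))
        sink.flush()
    feeder.join()
    return proc.wait()


sink = io.StringIO()
status = wrap(
    [sys.executable, "-c", CHILD_SOURCE],
    ["one\n", "two\n"],
    sink,
    massage=lambda s: "> " + s,
)
print(sink.getvalue(), end="")
```

In a real wrapper, `source` would be `sys.stdin` and `sink` would be `sys.stdout`; the in-memory objects here just keep the example self-contained.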

5gon12eder
  • As noted by @Vorticity below, programs attached to a PIPE usually buffer their output. To get around this, I used `pty.openpty` to create a master/slave pair for the child's stdout. You set the subprocess.Popen's stdout to the slave, and use a thread to read from the master. – Will Oct 16 '14 at 21:35
  • @Will Oh, okay, that was your problem. Sorry, I didn't get that but your solution sounds good. – 5gon12eder Oct 16 '14 at 21:39
  • @Will: Don't ignore the warning unless you understand it very well otherwise a program that passes all tests may hang in production (it is not hard to understand, just make sure that you do). `.communicate()` uses threads only on Windows otherwise a [select loop is used](https://hg.python.org/cpython/file/tip/Lib/subprocess.py#l1603) – jfs Oct 18 '14 at 03:29
  • @J.F.Sebastian thx for the warning. I am now on top of the situation. I know what's happening under the hood. – Will Oct 18 '14 at 09:10
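
The `pty.openpty` trick mentioned in the comments might look roughly like this. It is a POSIX-only sketch under stated assumptions: `run_with_pty` is a made-up helper name, the child is a hypothetical inline script that reports whether its stdout looks like a terminal, and the master is read in the main thread here for brevity rather than in a separate thread as the comment describes.

```python
import os
import pty
import subprocess
import sys

# Hypothetical child: prints whether its stdout is a terminal.
CHILD_SOURCE = "import sys\nprint('isatty:', sys.stdout.isatty())\n"


def run_with_pty(cmd):
    """Run cmd with a pseudo-terminal as its stdout and return its output."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(cmd, stdout=slave_fd)
    os.close(slave_fd)  # keep only the child's copy of the slave end open
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break  # Linux reports EIO once the slave side is closed
        if not data:
            break  # other platforms signal this with an empty read
        chunks.append(data)
    os.close(master_fd)
    proc.wait()
    # The terminal layer usually turns "\n" into "\r\n"; undo that.
    return b"".join(chunks).decode().replace("\r\n", "\n")


output = run_with_pty([sys.executable, "-c", CHILD_SOURCE])
print(output, end="")
```

Because the child sees a terminal on stdout, C stdio-based programs typically switch to line buffering, which is the effect described in the comments.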