
I have 5 processes p1, p2, ..., p5. I want to write some data to the stdin of p1, pipe the output of p1 to the stdin of p2, and so on down the chain, and finally read the result from the output of p5.

What I have tried so far:

p1 = Popen(['p1'], stdin=PIPE, stdout=PIPE)
p2 = Popen(['p2'], stdin=p1.stdout, stdout=PIPE)
...
p5 = Popen(['p5'], stdin=p4.stdout, stdout=PIPE)

# write data to stdin
p1.stdin.write(indata)
p1.stdin.close()

# not sure in what order to close the pipes here, if at all

# read output
out = p5.stdout.read()
print out

The last code snippet just hangs, presumably because I am doing the read/write operations incorrectly.

I was able to get single processes working using communicate(), and two processes without supplying any input to the first one (example from the Python docs):

output=`dmesg | grep hda`
==>
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

But I cannot figure out how to supply input to the first process without hanging the interpreter.

I could also use a bash script for that (I've already written one and it works), but I'd like to know how to achieve the same with Python.

So I'd like to ask how to do all that properly, specifically in what order to do the read/write/close operations on the pipes.

I'm working on 64-bit Linux, if that matters.

EDIT: I forgot to mention that all processes p1, ..., p5 consume all the input they are given, process it, write to stdout and then terminate. Thus, the later processes in the pipeline should not terminate before the earlier ones have finished processing.

EDIT2: I know that I could also use

command = 'bash -c "p1 | p2 | p3 | p4 | p5"'
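# stdin/stdout must be PIPE for communicate() to send input and capture output
proc = Popen(command, shell=True, stdin=PIPE, stdout=PIPE)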
out, err = proc.communicate(input=indata)
print out

but my main interest is getting to know how to chain the pipes purely in Python code.

Timo
  • There's a related question here: http://stackoverflow.com/q/295459/1858225 It appears that using an explicit temp file (as in the accepted answer here) is *not* necessary; however, there doesn't seem to be any straightforward and purely Pythonic way to do this, which surprises me. Plumbum (mentioned in one of the answers) looks somewhat nice, but far too "magic" for me (this is Python, not Perl!). The methods for accomplishing this entirely using `subprocess` (e.g. http://sam.nipl.net/code/python/pipeline.py which is from a comment on another answer) appear to be prone to strange errors. – Kyle Strand Dec 04 '14 at 17:53
  • ....actually, I just discovered the `pipes` module (https://docs.python.org/2/library/pipes.html) and added an answer to the other question accordingly. It looks much better than the other solutions. – Kyle Strand Dec 04 '14 at 18:53

1 Answer


Maybe this can help:

import sys
import tempfile
from subprocess import Popen, PIPE


cmd = [sys.executable, '-c', 'print raw_input()']

# Using a temp file to give input data to the subprocess instead of stdin.write to avoid deadlocks.
with tempfile.TemporaryFile() as f:
    f.write('foobar')
    f.seek(0)  # Rewind to the start of the file so that the subprocess p1 can read what we wrote.
    p1 = Popen(cmd, stdin=f, stdout=PIPE)

p2 = Popen(cmd, stdin=p1.stdout, stdout=PIPE)
p3 = Popen(cmd, stdin=p2.stdout, stdout=PIPE)

# No order needed.
p1.stdout.close()
p2.stdout.close()

# Using communicate() instead of stdout.read to avoid deadlocks. 
print p3.communicate()[0]

Output:

$ python test.py
foobar
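For readers on Python 3, the same temp-file approach might look like the sketch below. It assumes Python 3 (`print(input())` and a bytes payload, as noted in the comments); the `run_pipeline` helper name is made up for illustration, and the three identical stages are only stand-ins for the real p1, p2, p3.

```python
import sys
import tempfile
from subprocess import Popen, PIPE

# Stand-in for the real programs: each stage copies one line from stdin to stdout.
cmd = [sys.executable, '-c', 'print(input())']


def run_pipeline(data):
    """Feed `data` (bytes) through three chained stages and return the final stdout.

    `run_pipeline` is a hypothetical helper name, not part of subprocess.
    """
    # A temp file has a real file descriptor, so it can be used as stdin directly.
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.seek(0)  # rewind so the first stage reads from the beginning
        p1 = Popen(cmd, stdin=f, stdout=PIPE)

    p2 = Popen(cmd, stdin=p1.stdout, stdout=PIPE)
    p3 = Popen(cmd, stdin=p2.stdout, stdout=PIPE)

    # Drop the parent's copies of the intermediate read ends.
    p1.stdout.close()
    p2.stdout.close()

    out = p3.communicate()[0]
    # Reap the earlier stages so they do not linger as zombies.
    p1.wait()
    p2.wait()
    return out


if __name__ == '__main__':
    print(run_pipeline(b'foobar\n').decode(), end='')
```

The child process keeps its own copy of the file descriptor, so closing the temp file in the parent right after starting p1 is fine.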

Hope this is helpful.

mouad
  • Thank you for your solution. It is very clever and works. If I understand now, there is no way to do input without real file descriptors? For example, using StringIO file objects does not work, because there is no fileno? – Timo Jun 14 '11 at 11:06
  • @Timo: Yes, that's true, you need a real file with a fileno, and glad it was helpful :) – mouad Jun 14 '11 at 12:39
  • Heh, as soon as I read the question I thought "I'm sure the answer will involve file I/O somehow..." – JAB Jun 14 '11 at 13:44
  • @JAB: Yes, the subprocess module doesn't give us many choices :) – mouad Jun 14 '11 at 13:53
  • Python 3 note: `print(input())` and `f.write(b'foobar')` (or `with tempfile.TemporaryFile('w')`). That said, I couldn't check if it worked completely or not in my script, as in my case I was setting config values on Ubuntu with `cmd=["gsettings", "set", gsettings_schema, gsettings_key, value]` twice in a row, but the final value was sometimes the 1st, sometimes the 2nd. – hsandt Aug 26 '17 at 18:17
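On the `fileno` point raised in the comments above: a `PIPE` also has a real file descriptor, so a temp file is not the only option. A minimal sketch, assuming Python 3 (where `close_fds` defaults to true on POSIX, so later stages do not inherit the write end of the first stage's stdin), is to feed the first process from a helper thread while the main thread reads the last process's output. The `run_chain` name and the copy-through stage are hypothetical stand-ins for p1, ..., p5.

```python
import sys
import threading
from subprocess import Popen, PIPE

# Hypothetical stand-in stage: reads all of stdin, writes it to stdout, then exits.
stage = [sys.executable, '-c',
         'import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())']


def run_chain(data, n_stages=3):
    """Push `data` (bytes) through `n_stages` chained processes; return the output."""
    procs = [Popen(stage, stdin=PIPE, stdout=PIPE)]
    for _ in range(n_stages - 1):
        prev = procs[-1]
        procs.append(Popen(stage, stdin=prev.stdout, stdout=PIPE))
        # The parent no longer needs its copy of the intermediate read end.
        prev.stdout.close()

    def feed():
        # Runs in a separate thread so a large write cannot block the reader below.
        procs[0].stdin.write(data)
        procs[0].stdin.close()  # signals EOF to the first stage

    writer = threading.Thread(target=feed)
    writer.start()

    out = procs[-1].stdout.read()  # read until the last stage closes its stdout
    procs[-1].stdout.close()
    writer.join()
    for p in procs:
        p.wait()
    return out


if __name__ == '__main__':
    print(run_chain(b'foobar\n').decode(), end='')
```

Writing and reading from different threads keeps either side from blocking forever when the payload is larger than the OS pipe buffer.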