Code snippet from: http://docs.python.org/3/library/subprocess.html#replacing-shell-pipeline

output=`dmesg | grep hda`
# becomes
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

Question: I don't quite understand why this line is needed: p1.stdout.close(). What if this closes p1's stdout before p1 has finished writing its output, while p2 is still alive? Aren't we taking a risk by closing p1.stdout so soon? How does this work?

Ankur Agarwal
  • possible duplicate of [Explain example from python subprocess module](http://stackoverflow.com/questions/6046779/explain-example-from-python-subprocess-module) – Daniel Pryden Jan 07 '15 at 06:36

2 Answers

p1.stdout.close() closes Python's copy of the file descriptor for the read end of the pipe. p2 already has its own copy of that descriptor (inherited via stdin=p1.stdout), so closing Python's copy doesn't affect p2. After the close, p2 holds the only open read end, so when that one closes (e.g. if p2 dies), p1's next write to the pipe fails and p1 gets SIGPIPE.

If you didn't close p1.stdout in Python and p2 died, p1 would get no signal, because Python's descriptor would still be holding the read end of the pipe open. p1 would keep writing until the pipe buffer filled up and then block.
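To see the difference, here is a minimal sketch that runs the same pipeline with and without the close. It assumes a POSIX system with the `yes` and `head` commands available; they stand in for dmesg and grep because `head -n 1` exits early while `yes` never stops writing on its own. The names `run_pipeline` and `close_parent_copy` are made up for this illustration.

```python
import subprocess
from subprocess import Popen, PIPE


def run_pipeline(close_parent_copy):
    """Run `yes | head -n 1`; report head's output and how `yes` ended.

    Hypothetical helper for illustration only. Returns (output, returncode),
    where returncode is None if `yes` was still running after the timeout.
    """
    p1 = Popen(["yes"], stdout=PIPE)
    p2 = Popen(["head", "-n", "1"], stdin=p1.stdout, stdout=PIPE)
    if close_parent_copy:
        p1.stdout.close()  # only head holds the read end from here on
    output = p2.communicate()[0]  # head prints one line and exits
    try:
        returncode = p1.wait(timeout=2)
    except subprocess.TimeoutExpired:
        # `yes` is stuck writing into a pipe that the parent keeps open.
        returncode = None
        p1.kill()
        p1.wait()
    if not close_parent_copy:
        p1.stdout.close()
    return output, returncode


if __name__ == "__main__":
    print("with close:   ", run_pipeline(True))
    print("without close:", run_pipeline(False))
```

With the close, `yes` should terminate on its own shortly after `head` exits (on Linux the return code is normally the negative SIGPIPE signal number). Without it, `yes` is still alive when the timeout expires and has to be killed by hand.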

nneonneo

Pipes are external to processes (they're an operating system thing) and are accessed by processes through read and write handles. Many processes can hold handles to the same pipe and can read and write in all sorts of disastrous ways if not managed properly. A pipe closes only when all handles to it are closed.
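The "closes only when all handles are closed" rule can be checked inside a single process. This is a rough sketch, assuming a POSIX system (it relies on `os.set_blocking` working on pipe descriptors); the function name `pipe_eof_demo` is invented for the example.

```python
import os


def pipe_eof_demo():
    """Show that a reader sees EOF only after every write handle is closed."""
    r, w = os.pipe()
    w2 = os.dup(w)             # a second write handle to the same pipe
    os.write(w, b"hello")
    os.close(w)                # one write handle is still open (w2)

    os.set_blocking(r, False)  # so an empty pipe raises instead of hanging
    first = os.read(r, 100)    # the buffered data
    try:
        os.read(r, 100)
        would_block = False
    except BlockingIOError:
        # Pipe is empty but a writer still exists: no EOF yet.
        would_block = True

    os.close(w2)               # last write handle gone
    eof = os.read(r, 100)      # an empty bytes object signals EOF
    os.close(r)
    return first, would_block, eof


if __name__ == "__main__":
    print(pipe_eof_demo())
```

The same bookkeeping applies in the other direction: a writer is only told the pipe is broken once every read handle, in every process, has been closed.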

Although process execution works differently on Linux and Windows, here is basically what happens (I'm going to get killed on this!):

p1 = Popen(["dmesg"], stdout=PIPE)

Create pipe_1, give a write handle to dmesg as its stdout, and return a read handle in the parent as p1.stdout. You now have 1 pipe with 2 handles (pipe_1 write in dmesg, pipe_1 read in the parent).

p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)

Create pipe_2. Give grep a write handle to pipe_2 and a copy of the read handle to pipe_1. You now have 2 pipes and 5 handles (pipe_1 write in dmesg, pipe_1 read and pipe_2 write in grep, pipe_1 read and pipe_2 read in the parent).

p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.

Notice that pipe_1 has two read handles. You want grep to have the read handle so that it reads dmesg's data. You don't need the handle in the parent any more, so close it, leaving only 1 read handle on pipe_1. If grep dies, its pipe_1 read handle is closed, the operating system notices there are no remaining read handles for pipe_1, and it gives dmesg the bad news (SIGPIPE) on its next write.

output = p2.communicate()[0]

dmesg sends data to stdout (the pipe_1 write handle), which begins filling pipe_1. grep reads stdin (the pipe_1 read handle), which empties pipe_1. grep also writes to stdout (the pipe_2 write handle), filling pipe_2. The parent process reads pipe_2... and you've got yourself a pipeline!
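Putting the four steps together, here is a self-contained sketch with the handle bookkeeping in comments. Since dmesg may be unavailable or restricted, a small Python child prints a few made-up lines in its place; `grep` is assumed to be on the PATH, and `run_two_stage` is just an illustrative name.

```python
import sys
from subprocess import Popen, PIPE


def run_two_stage(producer_cmd, consumer_cmd):
    """Run `producer | consumer` and return (output, producer_rc, consumer_rc)."""
    # pipe_1: write handle in the producer, read handle in the parent.
    p1 = Popen(producer_cmd, stdout=PIPE)
    # pipe_2: write handle in the consumer, read handle in the parent.
    # The consumer also gets its own copy of the pipe_1 read handle.
    p2 = Popen(consumer_cmd, stdin=p1.stdout, stdout=PIPE)
    # Drop the parent's pipe_1 read handle; the consumer keeps its copy.
    p1.stdout.close()
    # Read pipe_2 until EOF, then wait for the consumer to exit.
    output = p2.communicate()[0]
    p1.wait()
    return output, p1.returncode, p2.returncode


# Stand-in for dmesg: the three lines below are invented sample data.
producer = [
    sys.executable,
    "-c",
    "print('hda: found'); print('sdb: other'); print('hda1: partition')",
]

if __name__ == "__main__":
    output, rc1, rc2 = run_two_stage(producer, ["grep", "hda"])
    print(output.decode(), rc1, rc2)
```

Calling p1.wait() at the end is not part of the documentation snippet; it is added here so the producer's exit status can be inspected and no zombie process is left behind.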

tdelaney