Should I use Popen's wait or communicate to read stdout in subprocess in Python 3?

Question

I am trying to run a subprocess in Python 3 and constantly read the output.

In the documentation for subprocess in Python 3 I see the following:

Popen.wait(timeout=None)

Wait for child process to terminate. Set and return returncode attribute.

Warning: This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

Which makes me think I should use communicate as the amount of data from stdout is quite large. However, reading the documentation again shows this:

Popen.communicate(input=None, timeout=None)...

Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached.

Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

So again, it seems there are problems with reading standard output from subprocesses this way. Can someone please tell me the best / safest way to run a subprocess and read all of its (potentially large) stdout?

Both of these methods wait for the process to finish. The answers to the question linked in the comment above show ways to process the output in chunks while the process is still running. — Tadhg McDonald-Jensen, Jun 01 '16 at 20:23
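
As a rough illustration of the chunked approach this comment refers to, the sketch below reads the child's stdout one line at a time instead of collecting it all at once. The helper name iter_stdout_lines and the child command are made up for the example, and text=True assumes Python 3.7 or later. Only stdout is piped here; if stderr were also set to PIPE, it would need to be drained as well to avoid the deadlock described in the warning.

```python
import subprocess
import sys


def iter_stdout_lines(cmd):
    """Yield the child's stdout line by line while it is still running.

    Hypothetical helper for illustration; not part of the subprocess module.
    """
    # The context manager closes the pipe and waits for the child on exit.
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        # Iterating the pipe keeps draining it, so the child never blocks
        # on a full OS pipe buffer, and only one line is held at a time.
        yield from proc.stdout
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


# Stand-in child process (an assumption for the demo): prints 100000 lines,
# which is more than a typical OS pipe buffer holds.
child_cmd = [sys.executable, "-c", "for i in range(100000): print(i)"]

line_count = 0
for line in iter_stdout_lines(child_cmd):
    line_count += 1  # process each line here instead of storing it

print(line_count)
```

Because each line is handled as it arrives, memory use stays roughly constant regardless of how much the child prints.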

score 2 · Answer 1 · answered Jun 01 '16 at 20:13

I think you should use communicate. The note warns you about the cost of the method's default behaviour, which keeps everything it reads in memory. The Popen constructor also takes a bufsize parameter that can be tuned, which may improve performance considerably for large amounts of data.
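
A minimal sketch of what this could look like, assuming a stand-in child command and an arbitrary 1 MiB value for bufsize (neither comes from the question). Whether a larger bufsize actually helps is not something this sketch demonstrates, and communicate() still returns the whole output as one in-memory bytes object.

```python
import subprocess
import sys

# Stand-in child process (an assumption for the demo): writes about 1 MB.
child_cmd = [sys.executable, "-c", "print('x' * 1000000)"]

proc = subprocess.Popen(
    child_cmd,
    stdout=subprocess.PIPE,
    bufsize=1024 * 1024,  # arbitrary illustrative value, not a recommendation
)

# communicate() reads stdout until end-of-file and then waits for the child,
# so it avoids the wait() deadlock, at the price of buffering all the data.
out, _ = proc.communicate()

print(len(out), proc.returncode)
```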

I hope it will help :)
