6

We are having some problems with the dreaded "too many open files" error on our Ubuntu Linux machine running a Python Twisted application. In many places in our program, we use subprocess.Popen, something like this:

process = Popen('ifconfig ' + iface, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
output = process.stdout.read()

while in other places we use subprocess communicate:

process = subprocess.Popen(['/usr/bin/env', 'python', self._get_script_path(script_name)],
                       stdin=subprocess.PIPE,
                       stdout=subprocess.PIPE,
                       close_fds=True)
out, err = process.communicate(data)

What exactly do I need to do in both cases to close any open file descriptors? The Python documentation is not clear on this. From what I gather (which could be wrong), both communicate() and wait() clean up any open fds on their own. But what about Popen? Do I need to close stdin, stdout, and stderr explicitly after calling Popen if I don't call communicate() or wait()?

Marc
  • 3
    You can use the value returned by `Popen()` as a context manager (only supported in 3.2 or later though). – Brian Cain Dec 12 '14 at 20:05
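A minimal sketch of the context-manager approach mentioned in the comment above, assuming Python 3.2 or later. The child command (a one-line Python script) is only a stand-in for illustration; it is not from the original question.

```python
import subprocess
import sys

# Hypothetical child command used only for illustration.
cmd = [sys.executable, "-c", "print('hello')"]

with subprocess.Popen(cmd, stdout=subprocess.PIPE) as process:
    output = process.stdout.read()

# Leaving the with block closes the pipes that Popen opened
# and waits for the child process to exit.
stdout_closed = process.stdout.closed
returncode = process.returncode
```

With this form there is no separate close() or wait() call to forget, which is the main appeal when the leak comes from many scattered Popen call sites.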

2 Answers

6

According to the source for the subprocess module (link), if you call communicate() you should not need to close the stdout and stderr pipes yourself.

Otherwise I would try:

process.stdout.close()
process.stderr.close()

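after you are done using the process object. Only close the streams you actually requested with PIPE: if stderr was not piped (or was redirected with stderr=STDOUT), process.stderr is None and there is nothing to close.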

For instance, when you call .read() directly:

output = process.stdout.read()
process.stdout.close()

Look in the above module source for how communicate() is defined and you'll see that it closes each pipe after it reads from it, so that is what you should also do.
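As a rough sketch of that pattern, the helper below reads from the pipe, closes it explicitly, and then waits for the child so it is reaped. The function name run_and_read and the child command are hypothetical and used only for illustration.

```python
import subprocess
import sys


def run_and_read(cmd):
    """Run cmd and return (combined output bytes, return code)."""
    process = subprocess.Popen(cmd,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT)
    try:
        output = process.stdout.read()
    finally:
        # Release the parent's end of the pipe even if read() fails.
        process.stdout.close()
        # Reap the child so it does not linger as a zombie.
        returncode = process.wait()
    return output, returncode


# Hypothetical child command used only for illustration.
output, returncode = run_and_read([sys.executable, "-c", "print('hello')"])
```

Since stderr is merged into stdout here, process.stderr is None and only one pipe needs closing.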

ErikR
  • 2
    You should avoid calling `process.stdout.read()` in almost all situations. It can cause deadlock if the buffer for stdin empties or the buffer for stderr becomes full whilst the parent process is still trying to read from stdout. – Dunes Dec 12 '14 at 22:15
  • That's good advice. `communicate()` closes the stdin pipe before calling `stdout.read()`, and this prevents most deadlocks. However, you can still deadlock if you want to get both stdout and stderr. My usual practice (now that I remember it) is to redirect both stdout and stderr to temporary files and/or `/dev/null` and capture the outputs that way. – ErikR Dec 12 '14 at 23:55
  • `communicate()` doesn't result in deadlock. Doing all your writing to stdin before reading from stdout/stderr can also result in deadlock. What if the child process writes a lot of data before it reads any, while the parent process is also writing a lot of data? The child process will get stuck trying to write, whilst the parent process is stuck trying to write. I believe `communicate()` uses a selector to make sure stdin, stdout and stderr are written to and read from in a timely fashion. See subprocess.Popen._communicate. – Dunes Dec 13 '14 at 10:14
  • 3
    Have you considered using `reactor.spawnProcess` instead? This might benefit your software in several ways. – Jean-Paul Calderone Dec 13 '14 at 12:37
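The scenario debated in the comments above (a child that writes a lot before it reads, while the parent is also writing a lot) can be tried out with a small sketch like the following. The child script and the payload size of 200000 bytes are assumptions chosen for illustration, picked to be larger than a typical pipe buffer.

```python
import subprocess
import sys

# Hypothetical child: writes a large block to stdout BEFORE reading stdin,
# then reports on stderr how many characters it received.
child_code = (
    "import sys\n"
    "sys.stdout.write('a' * 200000)\n"
    "sys.stdout.flush()\n"
    "data = sys.stdin.read()\n"
    "sys.stderr.write(str(len(data)))\n"
)

process = subprocess.Popen([sys.executable, "-c", child_code],
                           stdin=subprocess.PIPE,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)

# communicate() feeds stdin and drains stdout/stderr concurrently,
# then waits for the child to exit.
out, err = process.communicate(b"x" * 200000)

pipes_closed = (process.stdin.closed
                and process.stdout.closed
                and process.stderr.closed)
```

Writing the whole payload with process.stdin.write() before reading anything would risk both sides blocking on full pipes in this setup, which is the case Dunes describes.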
-1

If you're using Twisted, don't use subprocess. If you were using spawnProcess instead, you wouldn't need to deal with annoying resource-management problems like this.

Glyph