76

I am using subprocess to call another program and save its output to a variable. This is repeated in a loop, and after a few thousand iterations the program crashed with the following error:

Traceback (most recent call last):
  File "./extract_pcgls.py", line 96, in <module>
    SelfE.append( CalSelfEnergy(i) )
  File "./extract_pcgls.py", line 59, in CalSelfEnergy
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
  File "/usr/lib/python3.2/subprocess.py", line 745, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.2/subprocess.py", line 1166, in _execute_child
    errpipe_read, errpipe_write = _create_pipe()
OSError: [Errno 24] Too many open files

Code:

cmd = "enerCHARMM.pl -parram=x,xtop=topology_modified.rtf,xpar=lipid27_modified.par,nobuildall -out vdwaals {0}".format(cmtup[1])
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
out, err = p.communicate()
    Communicate() closes the pipe, so that's not your problem. In the end, Popen() is just the command that happens to run when you run out of pipes... the problem could be elsewhere in your code with other files being left open. I noticed "SelfE.append" ... are you opening other files and keeping them in a list? – tdelaney May 13 '13 at 17:38
  • did you try doing `ulimit -Sn unlimited` before running your python script? – Charlie Parker Feb 15 '21 at 21:32

10 Answers

67

On Mac OS X (El Capitan), check the current configuration:

#ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited

Set the open files value to 10K:

#ulimit -Sn 10000

Verify results:

#ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 10000
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited
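
To double-check from inside the script that the new limit actually applies to the Python process (shells and launchers can override it), you can query it with the POSIX-only resource module; a minimal sketch:

import resource

# soft limit is what the process is currently held to;
# hard limit is the ceiling the process may raise itself to
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft={}, hard={}".format(soft, hard))
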
  • the output of `ulimit -a` looks slightly different as of October 2019 (El Capitan 10.11.6), e.g. -n is now "file descriptors" not "open files" `-n: file descriptors`. But `ulimit -Sn 50000` solved my issue. Thanks. – Wlad Oct 18 '19 at 09:03
    why not `ulimit -Sn unlimited`? – Charlie Parker Feb 15 '21 at 21:32
    Seconding the above comment. **ulimit -Sn unlimited** makes sense but is it too dangerous for blocking other system processes or something similar? – benjamin deworsop Apr 20 '22 at 01:25
28

I guess the problem was that I was processing a still-open file with subprocess:

cmd = "enerCHARMM.pl -par param=x,xtop=topology_modified.rtf,xpar=lipid27_modified.par,nobuildall -out vdwaals {0}".format(cmtup[1])
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)

Here the cmd variable contains the name of a file that had just been created but not yet closed. subprocess.Popen then calls a system command on that file. After doing this many times, the program crashed with that error message.

So the lesson I learned from this is:

Close the file you have created, then process it
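
A minimal sketch of the fix, assuming the file named in cmd is written in the same loop iteration (the file name and data below are placeholders, not the actual script):

import subprocess

input_path = "coords_0001.crd"   # placeholder for the file the command operates on
data = "...file contents..."

# let the with-block close the file *before* its name is handed to the subprocess
with open(input_path, "w") as f:
    f.write(data)

cmd = "enerCHARMM.pl -par param=x,xtop=topology_modified.rtf,xpar=lipid27_modified.par,nobuildall -out vdwaals {0}".format(input_path)
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
out, err = p.communicate()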

20

You can try raising the open file limit of the OS:

ulimit -n 2048
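
The same thing can be done from inside the Python script with the POSIX-only resource module, which can raise the soft limit up to (but not beyond) the hard limit; a sketch, where 2048 is just an example value:

import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
target = 2048  # example value; pick whatever your workload needs
if hard != resource.RLIM_INFINITY:
    target = min(target, hard)
# a process may raise its own soft limit up to the hard limit without extra privileges
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))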

    Actually that command will not raise the limit above what has been set in `/etc/security/limits.conf`. To raise it, you need to place lines like these `* soft nofile 4096` / `* hard nofile 4096` in that file (replace `4096` with your own value). – Dan D. May 13 '13 at 17:54
    Ran into this problem yesterday, and I had to edit both `/etc/security/limits.conf` AND raise the limit via `ulimit -n` in ubuntu to overcome this error. – Chris J. Vargo Feb 21 '18 at 15:55
10

As others have noted, raise the limit in /etc/security/limits.conf. The system-wide file descriptor limit (fs.file-max) was also an issue for me personally, so I did

sudo sysctl -w fs.file-max=100000 

and added the following to /etc/sysctl.conf so it persists across reboots:

fs.file-max = 100000

Reload with:

sudo sysctl -p

Also, if you want to make sure that your process is not affected by anything else (mine was), use

cat /proc/{process id}/limits 

to find out what the actual limits of your process are; in my case the software running the Python scripts also applied its own limits, which overrode the system-wide settings.

I'm posting this answer after resolving my particular issue with this error; hopefully it helps someone.
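
On Linux you can also watch the descriptor count of the Python process itself while the loop runs; a small Linux-only sketch:

import os

def open_fd_count():
    # /proc/self/fd lists every file descriptor currently open in this process (Linux-only)
    return len(os.listdir("/proc/self/fd"))

# call this inside the loop; a steadily growing number points at a descriptor leak
print(open_fd_count())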

6

A child process created by Popen() may inherit open file descriptors (a finite resource) from the parent. Pass close_fds=True on POSIX (the default since Python 3.2) to avoid this. Also, "PEP 0446 -- Make newly created file descriptors non-inheritable" (implemented in Python 3.4) deals with some remaining issues.
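
For illustration, passing the flag explicitly looks like this (a sketch with a placeholder command; on POSIX with Python 3.2+ it is already the default):

import subprocess

cmd = "ls"  # placeholder for the real command string
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True,
                     close_fds=True)  # don't let the child inherit the parent's open descriptors
out, err = p.communicate()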

  • I don't think this works, at least in all cases. I generated 1200 sleep spawned processes on a system with a 1024 open file limit (default on Ubuntu) and it blew up even with close_fds=True. So I think there's more to it than that. Since you still have over the limit in open processes anyways and this only works if you assume that the problem lies in finished processes that left open file descriptors. – Sensei Oct 13 '16 at 16:51
  • @Sensei it does work: open files in the parent (make sure the fds are inheritable) then spawn subprocesses with `close_fds=False` (both are default on old Python versions, follow the links). See how much sooner you'll get the error. Obviously `close_fds` can't prevent the error in the general case: you don't even need to spawn a new process, to get it. – jfs Oct 13 '16 at 20:04
  • Except that it doesn't. I ran a simple for loop and generated enough subprocesses to hit the OS limit. I did this with close_fds=True. It had no impact. I may be wrong about why but my guess is simply that this solution only works if you're spawning a few subprocesses and never cleaning up the descriptors. In such a case this argument makes sense but I don't see it working if you actually intend to spawn and run that many processes at once. – Sensei Oct 14 '16 at 15:55
    @Sensei: I know that it works because there are tests in the stdlib that exercise this option (i.e., I know that it works not only for me). Now, your code may not work as you expect it. In that case, create a minimal but complete code example, describe the exact behavior that you expect and what happens instead step by step and publish it as a separate SO question (mention OS, Python version). – jfs Oct 14 '16 at 16:00
  • I think there is a misunderstanding between the two of you, I guess what Sensei says is that if you spawn (and don't terminate) many processes it will still crash. Whereas I think what you say is if you spawn many subprocesses (that will at some point terminate) then this solution works. I have a case where I just ran many `asyncio.create_subprocess_exec` (most of them sequentially, something like at most 10 opened simultaneously) and I still had a "bug"; when I looked at how many descriptors were opened by my script the number was way higher than 10, way way higher. I'm trying with your idea. – cglacet Mar 29 '20 at 11:35
  • Sadly it still crashes … – cglacet Mar 29 '20 at 11:49
  • @cglacet: what happens if you run [`test_close_fds()`](https://github.com/python/cpython/blob/044cf94f610e831464a69a8e713dad89878824ce/Lib/test/test_subprocess.py#L2630-L2675) – jfs Mar 30 '20 at 17:11
  • @jfs On python 3.7.3 (which is the one I used before) it crashes with the following error **ValueError: current limit exceeds maximum limit** on `resource.setrlimit(resource.RLIMIT_STACK, (newsoft, hard))`. But I seem to have this error for many tests (I tried `test_import`, same result). If I switch to another python version (3.8.0 for example) it works well. – cglacet Mar 31 '20 at 09:27
  • Found this : https://bugs.python.org/issue36432 (https://bugs.python.org/issue34602), I'll try to find what my problem is with running tests under 3.7.3 and I'll come with the results as soon as I can. – cglacet Mar 31 '20 at 10:02
  • My code also crashes on python 3.8.0 (I have an issue [here](https://stackoverflow.com/q/60928873/1720199) in case you need more details on what I'm trying to do). – cglacet Mar 31 '20 at 11:27
  • @cglacet I'm on Linux, I didn't see these failures. – jfs Mar 31 '20 at 16:57
  • @jfs I'm on OSX (10.14.6), but maybe that's due to my python 3.7.3 install (pyenv install). If it's matter to you I can try to re-install a clean version. – cglacet Apr 01 '20 at 12:33
5

Maybe you are invoking the command multiple times, and each call creates a new pipe via stdout=subprocess.PIPE. Between calls, try closing the pipe with p.stdout.close().
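
If you read the output yourself instead of calling communicate() (which closes the pipes for you), a sketch of one iteration looks like this:

import subprocess

cmd = "ls"  # placeholder for the real command
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
out = p.stdout.read()
p.stdout.close()  # release the pipe's file descriptor
p.wait()          # reap the child process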

5

Use context managers instead:

cmd = "enerCHARMM.pl -param=x,xtop=topology_modified.rtf,xpar=lipid27_modified.par,nobuildall -out vdwaals {0}".format(cmtup[1])
with subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True) as p:
    out, err = p.communicate()

This will close p.stdout and p.stderr when the with block exits, and also wait for the process to finish.

Related code in CPython: https://github.com/python/cpython/blob/208a7e957b812ad3b3733791845447677a704f3e/Lib/subprocess.py#L1031-L1038

Related document: https://docs.python.org/3/library/subprocess.html#subprocess.Popen

3

If you are working on Linux, you can easily debug this problem:

1 - In a terminal, start the command that will eventually fail with Too many open files.

python -m module.script

2 - Let it run for a while (so it can start opening the actual files), and once you believe it has done so, press CTRL+Z to suspend the process. The output will include the process id.

^Z
[2]  + 35245 suspended  python -m module.script

35245 is your PID.

3 - Now you can check which files are actually open and not closed.

ls -alht /proc/35245/fd/

In my case I was doing something very similar to the original post, but I was creating a temporary file with tempfile.mkstemp() before adding some data and then running subprocess.Popen.

In this case you need to close the file twice: once for the file object used to add the information, and a second time for the descriptor returned by mkstemp:

fd, path = tempfile.mkstemp()
with open(path, "wb") as f:
    f.write(bytes('my data', encoding='utf8'))
    f.close()   # this is one time
process = subprocess.Popen("my command that requires the previous file" ,[...])
os.close(fd)   # this is second time and the one I missed
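
One way to avoid juggling two closes is to wrap the descriptor from mkstemp() in a file object with os.fdopen(); closing that object (here via the with-block) also closes the underlying descriptor. A sketch under the same assumptions as above, with the same placeholder command:

import os
import subprocess
import tempfile

fd, path = tempfile.mkstemp()
# fdopen() wraps the raw descriptor, so leaving the with-block closes both
# the file object and the descriptor returned by mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(bytes('my data', encoding='utf8'))
# shell=True stands in for the elided arguments in the original snippet
process = subprocess.Popen("my command that requires the previous file", shell=True)
process.wait()
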
    Thanks for the command to look into open files using the PID, very useful. However, I have one question about your code snippet: Isn't the whole point of using `with` with `open` that it will automatically close the file at the end of the block? Why are you closing it explicitly inside the `with` block? – lotif Aug 18 '22 at 15:40
    @lotif you are right. `with` with `open` should close the file at the end of the block. I would have to test again to check, but answers such as this one https://stackoverflow.com/a/50113736/998649 would definitely create the problem. – Tk421 Aug 29 '22 at 06:21
2

Raise the limit to e.g. 32768 by adding the following lines to /etc/security/limits.conf:

* soft nofile 32768
* hard nofile 32768

Then, run ulimit -n 32768 as well.

Source: Dan D.'s comment.

-2

This opens the file in a subprocess; communicate() is a blocking call that waits for it to finish.

ss = subprocess.Popen(tempFileName, shell=True)
ss.communicate()