If the problem is that too many files are open, you have to set the FD_CLOEXEC flag on the file descriptors so that they are closed when exec happens. Here is a piece of code that simulates hitting the file descriptor limit while reloading, and which contains a fix that avoids hitting the limit. To simulate a crash, set fixit to False. When fixit is True, the code goes through the list of file descriptors and sets FD_CLOEXEC on them. This works on Linux. People working on systems that don't have /proc/<pid>/fd/ will have to find a system-appropriate way to list the open file descriptors. This question may help.
import os
import sys
import fcntl

pid = str(os.getpid())

def fds():
    return os.listdir(os.path.join("/proc", pid, "fd"))

files = []

print "Number of files open at start:", len(fds())

for i in xrange(0, 102):
    files.append(open("/dev/null", 'r'))

print "Number of files open after going crazy with open()", len(fds())

fixit = True
if fixit:
    # Cycle through all file descriptors opened by our process.
    for f in fds():
        fd = int(f)
        # Transmit the stds to future generations, mark the rest as close-on-exec.
        if fd > 2:
            try:
                fcntl.fcntl(fd, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
            except IOError:
                # Some files can be closed between the time we list
                # the file descriptors and now. Most notably,
                # os.listdir opens the dir and it will probably be
                # closed by the time we hit that fd.
                pass

print "reloading"
python = sys.executable
os.execl(python, python, *sys.argv)
With this code, what I get on stdout are these 3 lines repeated until I kill the process:
Number of files open at start: 4
Number of files open after going crazy with open() 106
reloading
How the code works
The code above gets the list of open file descriptors through the fds() function. On a Linux system, the file descriptors opened by a specific process are listed at:
/proc/<process id of the process we want>/fd
So if the process id of your process is 100 and you do:
$ find /proc/100/fd
You'll get a list like:
/proc/100/fd/0
/proc/100/fd/1
/proc/100/fd/2
[...]
The fds() function just gets the basenames of all these files: ["0", "1", "2", ...]. (A more general solution might convert them to integers right away. I chose not to do that.)
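For illustration, the integer-returning variant mentioned in the parenthesis could look roughly like the sketch below. The name open_fds is made up for this example, and the code assumes a Linux-style /proc just like fds() does. Note that the listing can include the descriptor that os.listdir itself uses while reading the directory.

```python
import os

def open_fds(pid=None):
    """Return the open file descriptors of a process as a sorted list of ints.

    Assumes a Linux-style /proc filesystem. open_fds is a hypothetical
    helper name used only for this sketch.
    """
    if pid is None:
        pid = os.getpid()
    fd_dir = os.path.join("/proc", str(pid), "fd")
    # Each entry in the directory is named after a descriptor number.
    return sorted(int(name) for name in os.listdir(fd_dir))
```

With this, the loop in the main code would not need the int(f) conversion.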
The second key part is setting FD_CLOEXEC on all the file descriptors except std{in,out,err}. Setting FD_CLOEXEC on a file descriptor tells the operating system that the next time exec is executed, the OS should close the file descriptor before giving control to the next executable. This flag is documented on the man page for fcntl.
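As a small sketch of how the flag can be set and checked from Python: the helper names set_cloexec and is_cloexec are invented for this example. Unlike the code above, this version reads the current descriptor flags first and ORs FD_CLOEXEC into them, which is a slightly more defensive choice rather than something the fix requires.

```python
import fcntl

def set_cloexec(fd):
    """Mark fd as close-on-exec, keeping any other descriptor flags.

    set_cloexec is a hypothetical helper name used for this sketch.
    """
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

def is_cloexec(fd):
    """Return True if fd will be closed by the OS on the next exec."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
```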
In an application that uses threads that open files, the code above can miss setting FD_CLOEXEC on some file descriptors: a thread may run between the time the list of file descriptors is obtained and the time exec is called, and open new files. I believe the only way to ensure that this does not happen would be to replace os.open with code that calls the stock os.open and then sets FD_CLOEXEC right away on the file descriptor returned.
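A minimal sketch of such a replacement is below. The name cloexec_open is hypothetical, and the sketch only covers calls that go through os.open; whether that catches every file your application opens is an assumption you would have to check. There is also still a short gap between the open and the fcntl call, so this narrows the window rather than proving it closed.

```python
import fcntl
import os

# Keep a reference to the stock function before any patching.
_stock_os_open = os.open

def cloexec_open(path, flags, mode=0o777):
    """Open path with the stock os.open, then set FD_CLOEXEC right away.

    cloexec_open is a hypothetical name used for this sketch.
    """
    fd = _stock_os_open(path, flags, mode)
    try:
        current = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, current | fcntl.FD_CLOEXEC)
    except Exception:
        # Do not leak the descriptor if setting the flag fails.
        os.close(fd)
        raise
    return fd

# To apply it process-wide (an assumption about how you would wire it in):
# os.open = cloexec_open
```

On Python 3.4 and later, descriptors returned by os.open are already non-inheritable by default, so there the extra fcntl call changes nothing.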