
I'm having multiple problems with a Python (v3.7) script using multiprocessing (mp hereafter). One of them is that my computations end with "OSError: [Errno 24] Too many open files". My scripts and modules are complex, so I've reduced the problem to the following code:

import multiprocessing as mp
import time

def worker(n):
    time.sleep(1)

n = 2000

procs = [mp.Process(target=worker, args=(i,)) for i in range(n)]
nprocs = 40
i = 0

while i < n:
    if len(mp.active_children()) <= nprocs:
        print('Starting proc {:d}'.format(i))
        procs[i].start()
        i += 1
    else:
        time.sleep(1)

[p.join() for p in procs]

This code fails once roughly 1020 processes have been started. I've always used multiprocessing in a similar fashion without running into this problem. I'm running this on a server with ~120 CPUs. I've recently switched from Python 2.7 to 3.7, and I don't know whether that could be an issue.

Here's the full trace:

Traceback (most recent call last):
  File "test_toomanyopen.py", line 18, in <module>
    procs[i].start()
  File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/p/jqueryrel/local_install/conda_envs/trois/lib/python3.7/multiprocessing/popen_fork.py", line 69, in _launch
    parent_r, child_w = os.pipe()
OSError: [Errno 24] Too many open files

I've seen a similar issue here, but I don't see how I can solve this.

Thanks

embrouille
  • What does `ulimit -n` say (assuming it's Linux/Unix)? – bereal May 03 '21 at 08:39
  • 1024… that's a big coincidence – embrouille May 03 '21 at 08:40
  • Perhaps instead of having every process communicate directly to the main process you can have some intermediate processes each responsible for communicating with 256 child processes to limit the number of open pipes – mousetail May 03 '21 at 08:42
  • 2
    It's not a coincidence ^^ – Iguananaut May 03 '21 at 08:42
  • Are you actually starting the processes in the module-level of your code? – Iguananaut May 03 '21 at 08:44
  • yes in a module, not in the main script. – embrouille May 03 '21 at 08:45
  • That's not a coincidence, it's a limit on how many open file descriptors a process may have, and pipes count as files. It can be configured, but in practice you will hardly ever need that many child processes. – bereal May 03 '21 at 08:46
  • I'm not really versed in multiprocessing... So the problem is that my processes are child processes? (I don't really know what those are...) How should I approach the problem? (Without changing the limit, because I'm pretty sure I don't have the rights to do so...) – embrouille May 03 '21 at 08:50
  • Please just use a Pool. There is no reason at all to do this manually. – MisterMiyagi May 03 '21 at 08:51
  • In my real computations, I need to save the results of the computations in Queues, which is why I didn't use a Pool (I thought it was not possible in this case, but maybe I'm wrong). But will using a Pool solve this problem? – embrouille May 03 '21 at 08:52
  • 1
    Yes. Yes it will. If you have a "real problem" that might prevent you from using a Pool, please ask about that. – MisterMiyagi May 03 '21 at 08:54
  • 1
    Some background: "Too many open files" does not mean literal files, it means file *handles* – including pipes and sockets. ``multiprocessing`` relies on pipes/sockets to communicate between the processes. By manually managing the processes, you do not clean up completed processes *and their handles* in time. A ``Pool`` will do that properly, on top of recycling processes – both keep the number of (file handle) resources needed for process management low. – MisterMiyagi May 03 '21 at 08:56
  • ok I understand. Indeed the Pool thing works in this case, thanks! The problem is my code is quite complex and I'm not sure I'll be able to change it to use a Pool, but I'm going to think about it (not all processes are of equal importance, some are launched only when others have produced their results and a specific queue has been emptied). Isn't there a way to close the process handles manually once their work has been done? – embrouille May 03 '21 at 09:04
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/231873/discussion-between-embrouille-and-mistermiyagi). – embrouille May 03 '21 at 09:49
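To illustrate the last few comments: the file descriptors come from the pipes multiprocessing opens for each child, and they are released once a finished process is joined (and, on Python 3.7+, closed). Below is a minimal sketch of that manual-cleanup idea, reusing the toy worker from the question; the `running` list and the 0.1 s polling interval are just illustrative choices, not anything from the original code.

import multiprocessing as mp
import time

def worker(n):
    time.sleep(1)

if __name__ == '__main__':
    n = 2000
    nprocs = 40
    running = []

    for i in range(n):
        # Reap finished children before starting new ones, so their
        # pipes (file descriptors) are released promptly.
        while len(running) >= nprocs:
            still_running = []
            for p in running:
                if p.is_alive():
                    still_running.append(p)
                else:
                    p.join()
                    p.close()  # Python 3.7+: frees the process object's resources
            running = still_running
            if len(running) >= nprocs:
                time.sleep(0.1)

        p = mp.Process(target=worker, args=(i,))
        p.start()
        running.append(p)

    # Wait for and clean up the remaining children.
    for p in running:
        p.join()
        p.close()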

1 Answer


To put the comments into an answer, here are several options to fix this:

  • Increase the limit on open file handles, e.g. by editing /etc/security/limits.conf. E.g. see here. (A way to check the current limit from within Python is sketched below.)
  • Don't spawn so many processes. If you have 120 CPUs, it doesn't really make sense to spawn more than 120 procs.
    • Maybe using Pool might be helpful to restructure your code (see the sketch at the end of this answer).
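For the first bullet, here is a small sketch of how one could inspect (and, up to the hard limit, raise) the per-process file-descriptor limit from Python itself, using the standard resource module (Unix only; raising the soft limit all the way to the hard limit is just one possible choice):

import resource

# Check the per-process limit on open file descriptors (Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('open-file limit: soft={}, hard={}'.format(soft, hard))

# Without root you can still raise the soft limit up to the hard limit.
if soft < hard:
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))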
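And here is a minimal sketch of the Pool approach for the toy example in the question (40 workers and 2000 tasks are simply the numbers from the question). A Pool keeps a fixed set of worker processes and their pipes alive and reuses them, so the number of open file descriptors stays small.

import multiprocessing as mp
import time

def worker(n):
    time.sleep(1)
    return n  # results can be collected directly instead of via explicit Queues

if __name__ == '__main__':
    # 40 long-lived workers process all 2000 tasks.
    with mp.Pool(processes=40) as pool:
        results = pool.map(worker, range(2000))
    print(len(results))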
Albert