
Why is my maximum number of workers in Pool capped at 60? Shouldn't it be even lower, considering I only have an 8-core CPU with 16 threads? For some reason it accepts 60 and runs fine, not using even 30% of the CPU, despite the workers competing for resources and context switching.

As soon as I set it to something like Pool(61) or higher, I get an error like this:

Exception in thread Thread-4:
Traceback (most recent call last):
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\threading.py", line 954, in _bootstrap_inner
    self.run()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 519, in _handle_workers
    cls._wait_for_updates(current_sentinels, change_notifier)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 499, in _wait_for_updates
    wait(sentinels, timeout=timeout)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\multiprocessing\connection.py", line 884, in wait
    ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python39\lib\multiprocessing\connection.py", line 816, in _exhaustive_wait
    res = _winapi.WaitForMultipleObjects(L, False, timeout)
ValueError: need at most 63 handles, got a sequence of length 82

Apparently, this is a Windows limitation, according to this. But why?

If going over the CPU thread limit is possible, why won't Windows let me go to 80 or 120, considering I have CPU cycles to spare? Would this limitation still apply if I installed Linux on this machine? Is there some other way to circumvent it? Do people with Threadripper CPUs run into the same issue on Windows?

miran80

1 Answer


This is a Windows-specific limit, tied to the MAXIMUM_WAIT_OBJECTS limit (64) of WaitForMultipleObjects. You can see in your traceback that the ultimate problem is the call to _winapi.WaitForMultipleObjects; that's Windows-specific code. On Linux you should have no such problems.

There are ways around this limit (it basically involves creating nested hierarchies of handles to wait on), but it's complicated and has its own limitations; clearly the Python-level code hasn't bothered to use any of these workarounds. Within Python, I think you're stuck using multiple pools if you want to exceed the limit. Since the limit is on how many handles can be monitored in a single call to WaitForMultipleObjects, not on the total number of processes, multiple pools should work just fine.

ShadowRanger
  • my 2¢... If you need that many processes, it's probably better to construct your own "Pool" from `mp.Process` and other primitives. You can sometimes do slightly better in terms of overhead by customizing to your specific application rather than using the built-in Pool (mostly by optimizing what data you send to the child processes, and how you send it). – Aaron Sep 26 '21 at 07:19
  • @Aaron: Yeah. It's a fairly significant undertaking to rewrite a custom `Pool`, but if you honestly need hundreds of processes, it's an option. – ShadowRanger Sep 26 '21 at 12:36
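As a rough sketch of the custom-pool idea from the comments (all names here are my own, hypothetical choices): workers built from `mp.Process` pull tasks from a shared queue, so nothing ever calls WaitForMultipleObjects on dozens of sentinels at once the way the built-in Pool's watcher thread does.

```python
import multiprocessing as mp

def _worker(task_q, result_q):
    # Pull (func, arg) tasks until the sentinel None arrives.
    for func, arg in iter(task_q.get, None):
        result_q.put(func(arg))

def run_custom_pool(func, items, n_workers=4):
    # Bare-bones stand-in for Pool.map: no chunking, no error handling,
    # and results come back in completion order, not input order.
    items = list(items)
    task_q, result_q = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=_worker, args=(task_q, result_q))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for item in items:
        task_q.put((func, item))
    for _ in procs:
        task_q.put(None)  # one sentinel per worker so each one exits
    results = [result_q.get() for _ in items]
    for p in procs:
        p.join()
    return results

def square(x):
    return x * x

if __name__ == "__main__":
    print(sorted(run_custom_pool(square, range(10))))
```

This is deliberately minimal; a real replacement would need to handle worker exceptions, preserve result order, and cope with the spawn start method on Windows (functions passed through the queue must be importable at module level).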