
A co-worker and I have run into an issue on our Macs (his: Intel; mine: M1). I'm on macOS 12.5.1 Monterey (not sure about his).

With Python 3.7, the following code works as expected:

Toy Example

from concurrent.futures import ProcessPoolExecutor

def foo(a, b=0):
  return a + b

with ProcessPoolExecutor(max_workers=4) as executor:
  future = executor.submit(foo, 1, b=2)
  print(future.result())

# prints "3"

But with Python 3.8 through 3.10, I get a traceback that looks like this:

Process SpawnProcess-1:
Traceback (most recent call last):
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/concurrent/futures/process.py", line 237, in _process_worker
    call_item = call_queue.get(block=True)
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'foo' on <module '__main__' (built-in)>
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

If we fire up a Docker python:3.10-slim and execute the same code on the Mac, it works great in the container.

I can't find any existing question or evidence that others have run into this problem, but this toy example fails on both of our Macs. It seems to have trouble finding the definition of the foo function. I originally ran into this problem with Pebble, but have now reproduced it with the standard library.

Is there any known history of problems with Python 3.8+ on Mac and concurrent.futures?

More Detailed Example

It was pointed out that the toy example above can be fixed by checking for __main__, so I am including another example, using Pebble, that works everywhere except on Mac with Python 3.8+, where it throws the same sort of error. This is how I use Pebble in my real code; it breaks only with the later Python versions, and only on a Mac:

from pebble import concurrent


class Foo:
    def __init__(self, timeout):
        self.timeout = timeout

    def do_math(self, a, b):
        # Define our task function
        @concurrent.process(timeout=self.timeout)
        def bar(a, b=0):
            return a + b

        future = bar(a, b)
        return future.result()


if __name__ == "__main__":
    foo = Foo(timeout=5)
    print(foo.do_math(2, 3))
    # Prints 5, except on Mac Python 3.8+

Again, on Mac Python 3.8+ (only) it throws this error:

pebble.common.RemoteTraceback: Traceback (most recent call last):
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/concurrent/process.py", line 205, in _function_lookup
    return _registered_functions[name]
KeyError: 'Foo.do_math.<locals>.bar'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/common.py", line 174, in process_execute
    return function(*args, **kwargs)
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/concurrent/process.py", line 194, in _trampoline
    function = _function_lookup(name, module)
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/concurrent/process.py", line 209, in _function_lookup
    function = getattr(mod, name)
AttributeError: module '__mp_main__' has no attribute 'Foo.do_math.<locals>.bar'
XPlatform

2 Answers


Python 3.8 changed the default multiprocessing start method on Mac from fork to spawn, because forking was leading to crashes. (Fork-without-exec is just very precarious in general, and it can cause problems on non-Mac systems too, but Mac system frameworks in particular do not play well with forking.)
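
To confirm which start method a given interpreter uses, you can ask multiprocessing directly. This is a minimal sketch; describe_start_method is a hypothetical helper name used only for illustration, while get_start_method and get_all_start_methods are the standard library calls.

```python
import multiprocessing
import sys


def describe_start_method():
    """Return the default start method and every method this platform supports."""
    default = multiprocessing.get_start_method()
    available = multiprocessing.get_all_start_methods()
    return default, available


if __name__ == "__main__":
    default, available = describe_start_method()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} on {sys.platform}")
    print(f"default start method: {default}")
    print(f"available start methods: {available}")
```

Running this under 3.7 and 3.10 on the same Mac, and again inside the Linux container, should show where the default differs.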

Your code is unsafe to use with the spawn start method. In the first example, this is because you're missing an if __name__ == '__main__' guard. In the second example, it's because you're using a nested function, which cannot be loaded by the worker process.
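
The nested-function problem can be seen without starting any processes. The standard library sends functions to workers by pickling them, and pickle stores a function as a reference to its importable name rather than its code. The sketch below uses hypothetical names (add, make_nested, can_pickle) and only illustrates the principle; Pebble uses its own lookup, but it also resolves the function by name in the worker.

```python
import pickle


def add(a, b=0):
    """Module-level function: pickled as a reference to its importable name."""
    return a + b


def make_nested():
    """Return a function defined inside another function, like bar in the question."""
    def bar(a, b=0):
        return a + b
    return bar


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


if __name__ == "__main__":
    nested = make_nested()
    # The "<locals>" part of the qualified name is what cannot be looked up.
    print("qualified name of nested function:", nested.__qualname__)
    print("module-level add picklable:", can_pickle(add))
    print("nested bar picklable:", can_pickle(nested))
```

A freshly spawned worker starts with a new interpreter, so it can only find functions that exist at module level once the module is imported.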


You need to make your code spawn-safe. Add if __name__ == '__main__' guards, stop trying to run nested functions in worker processes, and fix whatever else you might be doing that doesn't work with spawn.
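
As a rough sketch of what a spawn-safe version of the class example could look like, here is the same shape using only the standard library (ProcessPoolExecutor rather than Pebble, which is a substitution on my part). The task function lives at module level and the entry point is guarded. The spawn context is forced so it behaves the same on Linux as on Mac.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def bar(a, b=0):
    """Task function at module level, so a spawned worker can import it by name."""
    return a + b


class Foo:
    def __init__(self, timeout):
        self.timeout = timeout

    def do_math(self, a, b):
        # Force "spawn" so the behavior matches macOS with Python 3.8+.
        ctx = multiprocessing.get_context("spawn")
        with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
            future = executor.submit(bar, a, b)
            # Waits up to `timeout` seconds for the result; unlike Pebble's
            # timeout, this does not terminate the worker process.
            return future.result(timeout=self.timeout)


if __name__ == "__main__":
    foo = Foo(timeout=5)
    print(foo.do_math(2, 3))
```

This needs to be run as a script file; pasted into the interactive console, the worker still has no module to import bar from. With Pebble the same restructuring should apply: decorate a module-level function instead of one defined inside do_math.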

You could try passing a fork context to pebble:

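import multiprocessing

from pebble import concurrent

# concurrent.process takes the multiprocessing context via its mp_context keyword
@concurrent.process(timeout=self.timeout, mp_context=multiprocessing.get_context('fork'))
def bar(a, b=0):
    ...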

but there's a good reason the default was changed. Using fork on Mac is likely to lead to weird crashes. If you're lucky, it'll crash immediately. If you're unlucky, you'll get an urgent call at 3 in the morning on a Saturday 5 months from now, when you've forgotten all about this and you have to figure out the problem from scratch.

user2357112

Your code will run successfully if you add an if __name__ == "__main__": guard:

from concurrent.futures import ProcessPoolExecutor

def foo(a, b=0):
  return a + b

if __name__ == '__main__':
  with ProcessPoolExecutor(max_workers=4) as executor:
    future = executor.submit(foo, 1, b=2)
    print(future.result())

# prints "3"
Philippe
  • Unfortunately this doesn't work either. But it's really two problems: 1) the suggested code still throws the same error, but more importantly 2) this is just a toy example of a real use case. I boiled the problem down to the simplest example I could, just to highlight the real problem: Python 3.8+ on Mac (only) has a problem with its `concurrent.futures` implementation. `Python 3.10.2 (main, Oct 4 2022, 16:52:15) [Clang 14.0.0 (clang-1400.0.29.102)] on darwin` `...` `AttributeError: Can't get attribute 'foo' on ` – XPlatform May 22 '23 at 13:57
  • This wasn't run as a script. It's in the interactive Python console. I can post a trivial toy example of using this same paradigm in a nested class, which is my real issue... but the replication of the issue is still the same, something is broken/inconsistent with only the Mac version 3.8+ of this, not the Docker/Linux versions. – XPlatform May 22 '23 at 15:26
  • When you ran my script, did you get ANY errors ? – Philippe May 22 '23 at 15:32
  • Nope! Ran the toy example decently. Unfortunately I don't know what that says about the errant behavior, especially when used in a class? I could post another example, perhaps that'd illustrate it where it's used in class? I thought my toy example above was good for pinpointing the behavior on Mac Python 3.8+ – XPlatform May 25 '23 at 12:58
  • https://stackoverflow.com/users/2125671/philippe I added an example of how I'm using the class in code. It's again a toy example, but I believe it illustrates the same error in a more meaningful way. Sorry if I wasn't clear in my original post! – XPlatform May 25 '23 at 13:20