
I am trying to nest a multiprocessing pool within another multiprocessing pool. Both levels need to be process pools; thread pools will not work for my purposes.

However, all 5 of my attempts using multiprocessing or pathos have failed.

Attempt #1:

from multiprocessing import Pool

def foo(a, b):
    return Pool(3).map(bar, range(a+b))

def bar(x):
    return Pool(3).map(baz, range(x*2))

def baz(z):
    return z**2

if __name__ == '__main__':
    results = Pool(3).starmap(foo, list(zip(range(0,10,2), range(1,10,2))))
    print(results)

gave the error

AssertionError: daemonic processes are not allowed to have children
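From what I can tell, this happens because Pool worker processes are started as daemon processes by default, and daemonic processes are not allowed to spawn children of their own. A quick sketch to confirm (the helper name `report_daemon` is just for illustration):

```python
from multiprocessing import Pool, current_process

def report_daemon(_):
    # Each Pool worker runs as a daemon process by default,
    # which is why it may not create child processes of its own.
    return current_process().daemon

if __name__ == '__main__':
    print(Pool(2).map(report_daemon, range(2)))  # prints [True, True]
```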


Attempt #2:

from pathos.multiprocessing import ProcessPool

def foo(a, b):
    return ProcessPool(3).map(bar, range(a+b))

def bar(x):
    return ProcessPool(3).map(baz, range(x*2))

def baz(z):
    return z**2

if __name__ == '__main__':
    results = ProcessPool(3).map(foo, range(0,10,2), range(1,10,2))
    print(results)

gave the error

AssertionError: daemonic processes are not allowed to have children


Attempt #3:

from pathos.parallel import ParallelPool

def foo(a, b):
    return ParallelPool(nodes=3).map(bar, range(a+b))

def bar(x):
    return ParallelPool(nodes=3).map(baz, range(x*2))

def baz(z):
    return z**2

if __name__ == '__main__':
    results = ParallelPool(nodes=3).map(foo, range(0,10,2), range(1,10,2))
    print(results)

gave the error

NameError: name 'AbstractWorkerPool' is not defined


Attempt #4:

Based on Python Process Pool non-daemonic?

import multiprocessing

class NoDaemonPool(multiprocessing.Pool):
    def Process(self, *args, **kwds):
        proc = super(NoDaemonPool, self).Process(*args, **kwds)

        class NonDaemonProcess(proc.__class__):
            """Monkey-patch process to ensure it is never daemonized"""

            @property
            def daemon(self):
                return False

            @daemon.setter
            def daemon(self, val):
                pass

        proc.__class__ = NonDaemonProcess

        return proc


def foo(a, b):
    return NoDaemonPool(3).map(bar, range(a+b))

def bar(x):
    return NoDaemonPool(3).map(baz, range(x*2))

def baz(z):
    return z**2

if __name__ == '__main__':
    results = NoDaemonPool(3).starmap(foo, list(zip(range(0,10,2), range(1,10,2))))
    print(results)

gave the error

class NoDaemonPool(multiprocessing.Pool):

TypeError: method expected 2 arguments, got 3
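As far as I can tell, this `TypeError` is raised at class-definition time because in Python 3 `multiprocessing.Pool` is not a class at all but a bound method of the default context, so it cannot be subclassed; the real class lives in the `multiprocessing.pool` module:

```python
import multiprocessing
import multiprocessing.pool

# multiprocessing.Pool is a bound method of the default context, so
# `class NoDaemonPool(multiprocessing.Pool)` ends up calling that method
# with (name, bases, namespace) and fails with the TypeError above.
print(type(multiprocessing.Pool))   # <class 'method'>

# The real Pool class lives in the multiprocessing.pool module.
print(multiprocessing.pool.Pool)
```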


Attempt #5:

Based on Python Process Pool non-daemonic?

import multiprocessing

class NoDaemonProcess(multiprocessing.Process):
    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass


class NoDaemonContext(type(multiprocessing.get_context())):
    Process = NoDaemonProcess


class NoDaemonPool(multiprocessing.Pool):
    def __init__(self, *args, **kwargs):
        kwargs['context'] = NoDaemonContext()
        super(NoDaemonPool, self).__init__(*args, **kwargs)


def foo(a, b):
    return NoDaemonPool(3).map(bar, range(a+b))

def bar(x):
    return NoDaemonPool(3).map(baz, range(x*2))

def baz(z):
    return z**2

if __name__ == '__main__':
    results = NoDaemonPool(3).starmap(foo, list(zip(range(0,10,2), range(1,10,2))))
    print(results)

gave the error

class NoDaemonPool(multiprocessing.Pool):

TypeError: method expected 2 arguments, got 3

Any advice on nesting one multiprocessing pool within another is greatly appreciated! Using pathos is not required, although pathos appears to already support nested/hierarchical multiprocessing maps, so I believe it may be an easier route than Python 3's built-in multiprocessing module.

Using Python 3.8.0 and pathos 0.2.5 on macOS Catalina 10.15.2.

Nyxynyx
  • Is there a strict reason to nest them? It might be simpler to flatten out the items which you are mapping over and using a single pool. – bnaecker Feb 14 '20 at 03:43
  • @bnaecker The functions being nested (which create their own multiprocessing pool) have already been written and should ideally not be modified much, so I initially think it's more straightforward to nest them (rather than modify them to use a single pool with the inputs flattened out, as you have suggested) – Nyxynyx Feb 14 '20 at 03:52
  • A couple of comments. First, there's an error with the example code you've posted. The argument at the first `starmap` is `[(0, 1), (2, 3), ...]`. Which means `foo` will first be called as `foo(0, 1)`. The statements inside will raise `TypeError: zip argument #1 must support iteration`, because `zip(a * b, a)` isn't valid as the arguments to `zip` are integers, which don't support iteration. – bnaecker Feb 14 '20 at 19:52
  • Second, it looks like you can use a `concurrent.futures.ProcessPoolExecutor` at the highest level to call `foo` in parallel. Unfortunately, that seems to just push the problem back, in that the next nested `Pool` inside `bar` fails with the same error. Without being able to modify the code in `foo`/`bar`/`baz`, it seems like the `concurrent.futures` approach is a non-starter. – bnaecker Feb 14 '20 at 19:54
  • It might be possible to write a decorator, which sets the multiprocessing context to create non-daemon Processes, and then going on calling your functions `foo`. You'll run into errors about the local decorated function inside the decorator not being picklable, but there may be a way around that by using `copyreg` to tell `pickle` how to serialize/deserialize the decorated function. – bnaecker Feb 14 '20 at 20:26
  • @bnaecker Thanks for catching the error, I have updated my example codes. – Nyxynyx Feb 14 '20 at 20:27

0 Answers