
I have a task which needs to run a number of subtasks, each on its own VM, and then, when all subtasks are complete, merge the results and present them back to the caller.

I have implemented this using a multiprocessing.Pool, and it's working great.

I now want to scale up by running several of these tasks in parallel.

My initial design was to wrap my task running in another multiprocessing.Pool, where each task runs in its own process, effectively fanning out as follows:

job
+----- task_a
|      +------ subtask_a1
|      +------ subtask_a2
|      +------ subtask_a3
+----- task_b
       +------ subtask_b1
       +------ subtask_b2
       +------ subtask_b3
  • job starts a multiprocessing.Pool with 2 processes, one for task_a and one for task_b.
  • in turn, task_a and task_b each start a multiprocessing.Pool with 3 processes, one for each of their subtasks.

When I tried to run my code, I hit an assertion error:

AssertionError: daemonic processes are not allowed to have children
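
For reference, here is a minimal sketch of the design that reproduces the error. The names run_job, run_task, run_subtask and reproduce are hypothetical stand-ins for my real code, and the sketch assumes a Unix-like system where the "fork" start method is available:

```python
import multiprocessing

# Assumption: a Unix-like system, so the "fork" start method is available.
CTX = multiprocessing.get_context("fork")


def run_subtask(name):
    # Hypothetical stand-in for the real work done on a VM.
    return f"{name}: done"


def run_task(task_name):
    # Inner pool: one process per subtask. Starting it from inside a Pool
    # worker (which is daemonic) is what triggers the assertion.
    subtasks = [f"sub{task_name}{i}" for i in (1, 2, 3)]
    with CTX.Pool(3) as inner:
        return inner.map(run_subtask, subtasks)


def run_job(task_names):
    # Outer pool: one process per task.
    with CTX.Pool(len(task_names)) as outer:
        return outer.map(run_task, task_names)


def reproduce():
    # The AssertionError is raised in the worker and re-raised by map()
    # in the parent; return its message, or None if nothing was raised.
    try:
        run_job(["task_a", "task_b"])
    except AssertionError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

The inner pool never gets to run any subtasks: the failure happens as soon as a worker of the outer pool tries to start its own child processes.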

Searching online for details, I found the following thread, an excerpt of which reads:

As for allowing children threads to spawn off children of its own using subprocess runs the risk of creating a little army of zombie 'grandchildren' if either the parent or child threads terminate before the subprocess completes and returns

I have also found workarounds which allow this kind of "pool within a pool" use:

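import multiprocessing
import multiprocessing.pool

class NoDaemonProcess(multiprocessing.Process):
    # Always report daemon=False so that workers may start children.
    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        # Ignore attempts (e.g. by Pool) to mark the process as daemonic.
        pass

# Subclass the default context so that the pool creates NoDaemonProcess workers.
class NoDaemonContext(type(multiprocessing.get_context())):
    Process = NoDaemonProcess

# A Pool whose worker processes are non-daemonic.
class MyPool(multiprocessing.pool.Pool):
    def __init__(self, *args, **kwargs):
        kwargs['context'] = NoDaemonContext()
        super().__init__(*args, **kwargs)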

However, given the above quote about "zombie grandchildren", it seems perhaps this is not a good design.

So I guess my question is:

  • What is the pythonic way to "fan out" multiple processes within multiple processes?
Steve Lorimer