Here is a reproducible example:

from multiprocessing import Process, Manager

manager = Manager()
shared_results_dict = manager.dict()

class WorkerProcess(Process):
    
    def __init__(self, shared_results_dict):
        super(WorkerProcess, self).__init__()
    
        self.shared_results_dict = shared_results_dict
        
    def run(self):
        self.shared_results_dict['a'] = 3
        
subproc = WorkerProcess(shared_results_dict)

subproc.daemon = True
subproc.start()

shared_results_dict['a']  # read the value back in the parent process

The code above works fine when the multiprocessing start method is set to fork, but it fails when the start method is set to either forkserver or spawn. I thought Manager was supposed to work with whatever start method I use?
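
For reference, the start method is switched with multiprocessing.set_start_method; the original call site isn't shown above, so this is just a sketch:

import multiprocessing as mp

# must be called at most once, before the Manager and any processes are created
mp.set_start_method('spawn')   # 'fork' and 'forkserver' were also tried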

StatsNoob
  • IPython Notebooks have problems with spawn in general... if you want spawn, use a normal Python interpreter that executes a .py file. The main problem is the lack of a consistent `__main__` file which must be imported by child processes when using spawn – Aaron Jun 04 '21 at 16:19

1 Answer

If you are running in a Jupyter Notebook, you need to put your Process subclass definition in a separate .py file, and that file will need its own from multiprocessing import Process statement. You will also need to put any code that creates subprocesses within a block controlled by if __name__ == '__main__':. Finally, if you want to print the updated dictionary in the main process, you need to wait for the subprocess to complete so you can be sure it has actually updated the dictionary; that also makes a daemon process pointless:

File worker.py (for example)

from multiprocessing import Process

class WorkerProcess(Process):

    def __init__(self, shared_results_dict):
        super(WorkerProcess, self).__init__()
        # the managed dict proxy is picklable, so it can be passed to the child
        self.shared_results_dict = shared_results_dict

    def run(self):
        # runs in the child process; writes go through the manager's proxy
        self.shared_results_dict['a'] = 3

Your Jupyter Notebook Cell:

from multiprocessing import Manager
from worker import WorkerProcess

if __name__ == '__main__':
    manager = Manager()
    shared_results_dict = manager.dict()
    subproc = WorkerProcess(shared_results_dict)

    #subproc.daemon = True
    subproc.start()
    # wait for process to terminate to be sure the dict has been updated
    subproc.join()

    print(shared_results_dict['a'])

Prints:

3
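
For completeness, here is a minimal sketch of the same test as a standalone script run with a plain Python interpreter; the explicit set_start_method('spawn') call is an addition to show that spawn works once WorkerProcess is importable:

# main.py -- run with: python main.py
import multiprocessing as mp
from worker import WorkerProcess

if __name__ == '__main__':
    # spawn works here because WorkerProcess is imported from worker.py
    mp.set_start_method('spawn')
    manager = mp.Manager()
    shared_results_dict = manager.dict()

    subproc = WorkerProcess(shared_results_dict)
    subproc.start()
    subproc.join()  # make sure the child has updated the dict
    print(shared_results_dict['a'])  # prints: 3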
Booboo
  • Thanks for the answer. I found sources explaining the need for the __name__ condition. But why does the function need to be written in a separate .py file and imported? What's the reason that functions written inside Jupyter fail? – StatsNoob Jun 05 '21 at 21:00
  • See the [documentation on multiprocessing](https://docs.python.org/3.9/library/multiprocessing.html), particularly the section that begins with `Note Functionality within this package requires that the __main__ module be importable by the children.` That whole section talks about creating subprocesses with the interactive interpreter but it pertains as well to Jupyter Notebook. Also, see specifically [Multiprocessing working in Python but not in iPython](https://stackoverflow.com/questions/23641475/multiprocessing-working-in-python-but-not-in-ipython/23641560#23641560). – Booboo Jun 05 '21 at 23:56
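
A quick way to see why the separate file matters (an illustration, not from the original thread): pickling a class by reference requires its defining module to be importable in the child, and a class defined in a notebook cell lives in __main__, which a spawned child cannot re-import:

from worker import WorkerProcess

# Reports 'worker', a module the spawned child can import to rebuild the class.
# A class defined directly in a notebook cell reports '__main__' instead, and
# under spawn/forkserver the child has no .py file backing that '__main__'.
print(WorkerProcess.__module__)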