
I'm trying to run some sample code for Python's multiprocessing.pool module that I found on the web. The code is:

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=4)
    inputs = [0, 1, 2, 3, 4]
    outputs = pool.map(square, inputs)

But when I try to run it, it never finishes executing and I have to restart the kernel of my IPython Notebook. What's the problem?

John Zwinck
Duccio Bertieri
  • That code works in regular (non-Notebook) Python. Can you confirm that it works on your system outside the Notebook? Can you also replace the Pool part with regular `map()` to confirm that works inside the Notebook? See also http://stackoverflow.com/a/23641560/4323 – John Zwinck Dec 04 '15 at 10:29
  • works for me. You may want to add more details e.g. the version of `ipython`, OS, etc. – cel Dec 04 '15 at 11:02
  • Does this answer your question? [python multiprocess don't finish properly](https://stackoverflow.com/questions/13395636/python-multiprocess-dont-finish-properly) – P_Sta May 11 '21 at 19:49

1 Answer


As the answer John linked in the comments explains, multiprocessing.Pool should, in general, not be expected to work well within an interactive interpreter. To understand why, consider how Pool does its job:

  • It forks Python workers, passing to them the name of the current Python file.
  • The workers then essentially do import <this file>, and listen for messages from the master.
  • The master sends function names along with function arguments to the workers via pickling. Note that the functions themselves cannot be sent: pickle serializes a function only as a reference (its module and name), not as code.
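
The last point can be checked directly. The sketch below (the helper name can_pickle is made up for illustration) suggests that pickling a function stores only a reference to its module and name, so a function that cannot be looked up by name cannot be pickled at all:

```python
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError):
        return False
    return True


# A built-in is pickled as a reference: the payload holds the module and
# function names, not the function's code.
payload = pickle.dumps(abs)
print(b"abs" in payload)             # the name travels in the payload
print(pickle.loads(payload) is abs)  # unpickling looks the name up again

# A lambda has no importable name, so there is nothing to reference.
print(can_pickle(abs))
print(can_pickle(lambda x: x))
```

This is why the receiving side must be able to import the module in which the function lives.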

When you try to perform this procedure from an interactive prompt, there is no reasonable "current Python file" to pass to the children for importing. Moreover, the functions you defined in your interactive prompt are not part of any module (they are dynamically defined), and hence cannot be imported by the children from that nonexistent module. So your easiest bet is to simply avoid using multiprocessing within IPython. IPython parallel is so much better anyway :)


For completeness' sake, I also checked what exactly happens in my particular case: IPython 4 running under Python 2.7 on Windows 8 (where I can observe the interpreter getting stuck as well). Interestingly, the reason IPython gets stuck in the first place is not one of those mentioned above.

It turns out that multiprocessing checks whether __main__.__file__ is defined, and if not, sends sys.argv[0] as the "current filename" to the children. In the case of (my version of) IPython sys.argv[0] is equal to C:\Dev\Anaconda\lib\site-packages\ipykernel\__main__.py.
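
To see which of the two values would apply in your own session, you can inspect them yourself. This is only a diagnostic sketch; the function name describe_main is hypothetical, and the values it reports depend on your IPython version and installation:

```python
import sys


def describe_main():
    """Collect the values that decide which "current filename" is used."""
    main_module = sys.modules["__main__"]
    return {
        # True when __main__.__file__ is defined (a regular script run).
        "has_file": hasattr(main_module, "__file__"),
        "main_file": getattr(main_module, "__file__", None),
        # The fallback described above when __file__ is missing.
        "argv0": sys.argv[0] if sys.argv else None,
    }


for key, value in sorted(describe_main().items()):
    print(key, "=", value)
```

Running this in a notebook cell and in a plain script should make the difference between the two environments visible.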

Unfortunately, before starting up, the worker processes check whether the file they are going to import is already in their sys.modules. Line 488 of multiprocessing/forking.py says:

assert main_name not in sys.modules, main_name

When main_name is __main__ (as is the case with IPython's workers), this assertion fails and the workers fail to start. The same code, however, is "smart" enough to check whether the passed name is ipython, in which case it performs no such check and does not import anything.
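
The behaviour described above can be modelled with a small sketch. This is not the library's source code: the function worker_would_start and its arguments are hypothetical names, and the logic only mirrors the checks as described in this answer (derive a module name from the file name, skip everything for ipython, otherwise require that the name is not already loaded).

```python
import ntpath  # parses Windows-style paths on any platform


def worker_would_start(main_path, loaded_modules):
    """Toy model of the worker-side check described in the answer."""
    # Strip the directory and the extension to get a module-like name.
    main_name = ntpath.splitext(ntpath.basename(main_path))[0]
    if main_name == "ipython":
        # Special case: no check and no import are performed.
        return True
    # Mirrors: assert main_name not in sys.modules, main_name
    return main_name not in loaded_modules


loaded = {"__main__", "sys"}
kernel_path = r"C:\Dev\Anaconda\lib\site-packages\ipykernel\__main__.py"
print(worker_would_start(kernel_path, loaded))  # name collides with __main__
print(worker_would_start("ipython", loaded))    # special-cased name
```

Under this model, the path reported by IPython yields the name __main__, which is always loaded already, while the literal name ipython bypasses the check.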

Consequently, the problem of workers failing to start could be solved using an ugly hack of defining __main__.__file__ to be equal to ipython. The following code does work fine from an IPython cell:

import sys
sys.modules['__main__'].__file__ = 'ipython'
from multiprocessing import Pool

pool = Pool(processes=4)
inputs = [0, 1, 2, 3, 4]
outputs = pool.map(abs, inputs)

Note that this example asks the workers to compute abs, a built-in function. It would fail (gracefully, with an exception) if you asked the workers to compute a function you defined within the notebook.
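
A less hacky way to use your own function is to put it in a real module that the workers can import. The following is a minimal sketch under stated assumptions: the module name pool_helpers and the functions make_helper_module and run_demo are made up for illustration, and the approach only helps on setups where the workers manage to start at all.

```python
import importlib
import os
import sys
import tempfile
from multiprocessing import Pool

HELPER_SOURCE = "def square(x):\n    return x * x\n"


def make_helper_module(directory, name="pool_helpers"):
    """Write a tiny module containing square() and import it."""
    path = os.path.join(directory, name + ".py")
    with open(path, "w") as handle:
        handle.write(HELPER_SOURCE)
    if directory not in sys.path:
        sys.path.insert(0, directory)  # make the new file importable
    importlib.invalidate_caches()
    return importlib.import_module(name)


def run_demo():
    """Map the imported function over a few inputs with a worker pool."""
    helpers = make_helper_module(tempfile.mkdtemp())
    pool = Pool(processes=4)
    try:
        return pool.map(helpers.square, [0, 1, 2, 3, 4])
    finally:
        pool.close()
        pool.join()
```

Because square now belongs to an importable module rather than to the interactive session, it can be pickled by reference. Calling run_demo() is left to the reader, since whether the pool starts depends on the environment discussed above.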

It turns out you can, in principle, go further with the hacking and have your functions sent over to the workers using some manual pickling of their code. You can find a pretty cool example of such a hack here.

KT.