
I just tried to apply multiprocessing to a loop that was written as a list comprehension, as described in "How to parallelize list-comprehension calculations in Python?"

The preliminaries work as they should:

>>> import multiprocessing
>>> try:
...     cpus = multiprocessing.cpu_count()
... except NotImplementedError:
...     cpus = 2   # arbitrary default
... 
>>> 
>>> def square(n):
...     return n * n
... 
>>> pool = multiprocessing.Pool(processes=cpus)
>>> cpus
12

Then, just to check that I'm not misunderstanding how map() works:

>>> map(square, range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

So far, that all looks reasonable. But when I execute the line given by Mahmoud in the accepted answer linked above:

>>> print pool.map(square, range(10))
Process PoolWorker-1:
Process PoolWorker-2:
Process PoolWorker-12:
Process PoolWorker-6:
Process PoolWorker-9:
Process PoolWorker-4:
Process PoolWorker-8:
Process PoolWorker-10:
Process PoolWorker-11:
Traceback (most recent call last):
  File "C:\Program Files\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Program Files\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Program Files\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\multiprocessing\pool.py", line 102, in worker
    task = get()
  File "C:\Program Files\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\multiprocessing\queues.py", line 376, in get
    return recv()
AttributeError: 'module' object has no attribute 'square'
Process PoolWorker-5:   

...and this takes the entire console with it. I have no idea why this does not work: it seems like a very simple example, and 'square' is indeed defined and works, as the test with map() shows. Am I overlooking something so obvious that others don't even mention it? Or is it something version-specific?

I'm using Python 2.7.6 (WinPython 64-bit, to be precise) on Windows 7 Professional, and this happens both in Spyder and in the stand-alone Python console.

HadeS
Zak
    The documentation states that this doesn't work in an interactive interpreter: https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers – user1514631 Oct 09 '15 at 12:29

1 Answer


These are all the things which were wrong:

1: Interactive console

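As user1514631 pointed out, multiprocessing.Pool does not work from an interactive interpreter on Windows. The worker processes must be able to import the module that defines the function, and a function typed at the prompt exists only in that session, which is why the workers fail with AttributeError: 'module' object has no attribute 'square'. This is painful for my way of programming (which involves writing half of the script in the interpreter, then pasting the correct code into a script).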
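
The AttributeError in the traceback is consistent with how Python pickles functions: by reference (module name plus function name), not by value. The following is a minimal sketch using only the standard pickle module; the variable names payload, has_reference, restored and same_object are illustrative and not part of the original code.

```python
import pickle


def square(n):
    return n * n


# Pickling a function stores only a reference to it
# (module name + function name), not the function body.
payload = pickle.dumps(square)

# Both names can be found in the pickled bytes.
module_name = square.__module__.encode("ascii")
has_reference = module_name in payload and b"square" in payload

# Unpickling looks the name up again in that module. Inside the same
# process the lookup succeeds and returns the very same function object.
restored = pickle.loads(payload)
same_object = restored is square
```

A pool worker on Windows is a freshly started interpreter, so the same lookup can only succeed there if the worker is able to import the module that defines square.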

2: multiprocessing.freeze_support()

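Without this call (and the guard in point 3), every worker process printed error messages mentioning freeze_support() as soon as the worker pool was created, well before I assigned it anything to do. According to the documentation, the call only has an effect when the program has been frozen into a Windows executable; in a script run normally by the interpreter it is harmless but optional.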

3: if __name__ == "__main__":

I usually put this line in most of my code so I can use it as a library or not, but this script was never going to be imported, so I did not have it. Putting the multiprocessing.freeze_support() statement in the first line of a script does not work on its own. There must be an if __name__ == "__main__": line, with the call directly after it, even if multiprocessing is only imported or used in some function. The likely reason is that Windows has no fork(), so each worker starts a fresh interpreter that imports the main script; any unguarded top-level code, including the pool creation, would then run again in every worker.

The working code for Windows then looks like this:

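import multiprocessing

# Defined at module level so that the worker processes can import it
def square(n):
    return n * n

if __name__ == "__main__":
    # Only has an effect in a frozen Windows executable; harmless otherwise
    multiprocessing.freeze_support()

    pool = multiprocessing.Pool(processes=5)

    print pool.map(square, range(10))

    # Shut the workers down cleanly
    pool.close()
    pool.join()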
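
For readers on Python 3, roughly the same script can be written with the pool as a context manager (supported since Python 3.3). This is only a sketch and not part of the original answer; the helper name parallel_squares and the default of 2 worker processes are arbitrary choices for illustration.

```python
import multiprocessing


def square(n):
    return n * n


def parallel_squares(values, processes=2):
    # pool.map blocks until all results are available; leaving the
    # with-block then terminates the worker processes.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only has an effect in a frozen Windows executable; harmless otherwise.
    multiprocessing.freeze_support()
    print(parallel_squares(range(10)))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

As in the Python 2 version, the pool is only created under the __main__ guard, so a worker that imports this file does not try to start a pool of its own.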
Zak