I've made a minimal example showing that Python's multiprocessing raises an error when the worker function is defined inside another function. Is there any way around this?

This code works:

from multiprocessing import Pool
import time

def _foo(my_number):
  square = my_number * my_number
  time.sleep(1)
  return square 

with Pool() as p:
  r = list(p.imap(_foo, range(10)))
print(r)

This code doesn't:

from multiprocessing import Pool
import time

def test():
  def _foo(my_number):
    square = my_number * my_number
    time.sleep(1)
    return square 

  with Pool() as p:
    r = list(p.imap(_foo, range(30)))
  print(r)

test()

It errors out with the message AttributeError: Can't pickle local object 'test.<locals>._foo'.

There have been other questions related to this, but I'm pretty sure this code doesn't run into any of the problems they describe. In particular, Pool() is started after everything it needs (_foo) has been defined.

So how should one call multiprocessing from within a function using something defined in that function?

chausies
  • Because locally defined functions like that are not picklable, and multiprocessing requires things to be picklable. – juanpa.arrivillaga Apr 12 '19 at 18:48
  • Unfortunately, `pickle` seems to be baked in. There is the whole `dill`/`pathos` project that provides alternatives, see: https://stackoverflow.com/questions/19984152/what-can-multiprocessing-and-dill-do-together. But you could also consider refactoring your code. – juanpa.arrivillaga Apr 12 '19 at 18:50
  • @juanpa.arrivillaga so is it just impossible to parallelize a for-loop that occurs inside a function? One can only parallelize for loops that occur in the global scope? That seems ridiculous. – chausies Apr 12 '19 at 18:51
  • Multiprocessing is for launching *multiple python processes*. These processes have to communicate, and `multiprocessing` chose `pickle` as the serialization format. It doesn't matter *where you call the `.map`* (if that's what you mean by "parallelizing a for loop", which is not a good way of thinking about things); what matters is what you are sending across the wire. In this case, that is `_foo`, which cannot be pickled. Again, look at the `pathos` project, which implements a substitute for pickle that would work in this case. It would help to see what you are actually trying to accomplish. – juanpa.arrivillaga Apr 12 '19 at 18:53
