
I am using the following test code:

from pathos.multiprocessing import ProcessingPool as Pool
import numpy

def foo(obj1, obj2):
   a = obj1**2
   b = numpy.asarray(range(1,5))
   return obj1, b

if __name__ == '__main__':
    p = Pool(5)
    res = p.map(foo, [1,2,3], [4,5,6])

It gives the following error:

File "C:\Python27\lib\site-packages\multiprocess\pool.py", line 567, in get
    raise self._value
NameError: global name 'numpy' is not defined

What am I doing wrong in the code?

Edit: Why was this question voted down twice?

I have numpy installed, and my interpreter has been using it correctly; the error only appears when I use it with multiprocessing. I have been coding with the same install for a while.

pppery
Zanam
  • Please make sure you have numpy installed. If so, make sure you've installed it for the Python you're actually using. If so, try putting the `import numpy` inside your `foo` function. I think the question was downvoted because Google returns a lot of answers if you search for your error. – MSeifert Aug 04 '16 at 19:29
  • I'm not sure if you have seen my edit to the comment. Maybe pathos doesn't know you imported numpy. Maybe you should put the `import numpy` inside your `foo` function. – MSeifert Aug 04 '16 at 19:39
  • How do I resolve this problem? In my real-life example I have to import more than 10 packages, and I can't be doing that in the function. Or is that the only way? – Zanam Aug 04 '16 at 20:18
  • Does it work when you import it inside the function? – MSeifert Aug 04 '16 at 20:19
  • Yes, it works when I do the import inside the function. – Zanam Aug 04 '16 at 20:21

1 Answer


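It seems that imports are not shared between processes. On Windows each worker is a freshly started interpreter rather than a fork of the parent, and the name `numpy` that your script bound at module level is apparently not available when `foo` runs in a worker. Therefore you need to import numpy in each process separately.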

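In your case this means adding `import numpy` inside your `foo` function. The cost is negligible: Python caches imported modules, so the import does real work only once per worker process, and every later call is just a cache lookup. That is small compared with the cost of starting the process itself.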
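As a minimal sketch of that fix, here is `foo` from the question with the import moved inside the function body. The direct call at the end is only there to show the return value without a pool; it is an illustration, not part of the original code.

```python
import sys


def foo(obj1, obj2):
    # Import inside the function so the name is bound in whichever
    # process executes foo, not only in the parent script.
    import numpy

    a = obj1 ** 2  # computed but unused, as in the question
    b = numpy.asarray(range(1, 5))
    return obj1, b


# Direct call without a pool, just to show what foo returns.
first, arr = foo(1, 4)
print(first, arr.tolist())

# After the first call the module sits in the import cache, so later
# "import numpy" statements inside foo are only a lookup.
print('numpy' in sys.modules)
```

With this version the pool call from the question, `p.map(foo, [1,2,3], [4,5,6])`, stays unchanged, and the comments above confirm that the in-function import works in that setup.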

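The other alternative would be to pass the module to the function as an argument (not recommended, and I'm not sure it will work). Because `map` takes one iterable per parameter, the module has to be repeated once per task, and `foo` has to be defined before the pool uses it:

from pathos.multiprocessing import ProcessingPool as Pool
import numpy

def foo(np, obj1, obj2):
    a = obj1**2
    b = np.asarray(range(1,5))
    return obj1, b

if __name__ == '__main__':
    p = Pool(5)
    # one copy of the module per task, matching the three inputs
    res = p.map(foo, [numpy]*3, [1,2,3], [4,5,6])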
MSeifert