I have some sample code that uses a global variable, and it's giving me errors. The global variable x is assigned a value in the test3 function before the test2 function is called, but test2 doesn't appear to see the definition of x.
from multiprocessing import Pool
import numpy as np

global x

def test1(w, y):
    return w+y

def test2(v):
    global x # x is assigned value in test3 before test2 is called
    return test1(x, v)

def test3():
    global x
    x = 2
    y = np.random.random(10)
    with Pool(processes=6) as p:
        z = p.map(test2, y)
    print(z)

if __name__ == '__main__':
    test3()
The error is:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "...\my_global_variable_testcode.py", line 23, in test2
    return test1(x, v)
NameError: name 'x' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "...\my_global_variable_testcode.py", line 35, in <module>
    test3()
  File "...\my_global_variable_testcode.py", line 31, in test3
    z = p.map(test2, y)
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\WinPython-64bit-3.5.2.1Qt5\python-3.5.2.amd64\lib\multiprocessing\pool.py", line 608, in get
    raise self._value
NameError: name 'x' is not defined
I have looked at a lot of questions and answers on SO, but I still haven't been able to figure out how to fix this code. I would be grateful if someone could point out what the issue is.
Can anyone show me how to rewrite the code above without changing its basic structure (i.e. retaining test1, test2 and test3 as three separate functions, since in my original code these functions are quite long and complex) so that I can achieve my goal of multiprocessing?
P.S. This sample code is just a simplified version of my actual code. I am giving the simplified version here to figure out how to make global variables work (I am not trying to find a complicated way to compute 2+np.random.random(10)).
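A minimal sketch of what may be going on, assuming the start method is the cause: on Windows, multiprocessing starts each worker with the spawn method, so the worker imports the script afresh and never runs the assignment that test3 makes under the __main__ guard. The names read_x and demo are hypothetical and not part of my code, and the spawn context is requested explicitly so that the same behaviour can be observed on other platforms.

```python
import multiprocessing as mp

x = 3  # value assigned when the module is imported


def read_x(_):
    # Returns the x visible in whichever process runs this call.
    return x


def demo():
    global x
    x = 2  # this change happens only in the parent process
    ctx = mp.get_context('spawn')  # the start method that Windows uses
    with ctx.Pool(processes=2) as p:
        seen = p.map(read_x, range(4))
    return x, seen


if __name__ == '__main__':
    parent_x, worker_x = demo()
    print('parent sees', parent_x)
    print('workers see', worker_x)
```

Run as a script, the parent should report 2 while every worker reports 3, because each worker only ever executes the module-level assignment. When x has no module-level assignment at all, as in the listing above, the workers have nothing to read and raise NameError instead.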
* EDIT * - BOUNTY DESCRIPTION
This bounty is for someone who can help me rewrite this code while preserving the basic structure of its functions:
(i) test3 makes the multiprocessing call to test2, and test2 in turn calls test1
(ii) it makes use of either global variables, the Manager class of the multiprocessing module, or anything else that avoids having test3 pass common variables to test2
(iii) test3 also assigns values to, or makes changes to, the global variables / common data before calling the multiprocessing code
(iv) the code should work on Windows (as I am using Windows). I am not looking for a solution that works on Linux / OSX at this time.
To help with the bounty, let me give two different test cases.
* case 1 - non-multiprocessing version *
import numpy as np

x = 3

def test1(w, y):
    return w+y

def test2(v):
    global x
    print('x in test2 = ', x)
    return test1(x, v)

def test3():
    global x
    x = 2
    print('x in test3 = ', x)
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    z = test2(y)
    print(z)

if __name__ == '__main__':
    test3()
The output (correct) is:
x in test3 = 2
x in test2 = 2
[ 3 4 5 6 7 8 9 10 11 12]
* case 2 - multi-processing version *
from multiprocessing import Pool
import numpy as np

x = 3

def test1(w, y):
    return w+y

def test2(v):
    global x
    print('x in test2 = ', x)
    return test1(x, v)

def test3():
    global x
    x = 2
    print('x in test3 = ', x)
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    with Pool(processes=6) as p:
        z = p.map(test2, y)
    print(z)

if __name__ == '__main__':
    test3()
The output (incorrect) is:
x in test3 = 2
x in test2 = 3
x in test2 = 3
x in test2 = 3
x in test2 = 3
x in test2 = 3
x in test2 = 3
x in test2 = 3
x in test2 = 3
x in test2 = 3
x in test2 = 3
[4, 5, 6, 7, 8, 9, 10, 11, 12, 13]