Does the scope of a numpy ndarray work differently within a function called by multiprocessing? Here is an example:
Using Python's multiprocessing module I am calling a function like so:
import multiprocessing as mp

jobs = []
for core in range(cores):
    # target could be f() or g()
    proc = mp.Process(target=f, args=(core,))
    jobs.append(proc)
for job in jobs:
    job.start()
for job in jobs:
    job.join()
import random

def f(core):
    x = 0
    x += random.randint(0, 10)
    print x
import numpy as np

def g(core):
    # Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:, core])
    shuffled = np.random.permutation(local)
Calling f(core), the x variable is local to the process, i.e. it prints a different random integer as expected. These never exceed 10, indicating that x = 0 in each process. Is that correct?
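To sanity-check the f case, I could have each worker report its x back through a Queue (the Queue and the hard-coded worker count below are just for illustration, not part of my actual code):

import multiprocessing as mp
import random

def f(core, queue):
    # x is created inside the child, so each worker operates on its own copy
    x = 0
    x += random.randint(0, 10)
    queue.put((core, x))

if __name__ == '__main__':
    queue = mp.Queue()
    jobs = [mp.Process(target=f, args=(core, queue)) for core in range(4)]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    # Every reported value is between 0 and 10, consistent with x starting at 0
    for _ in jobs:
        print queue.get()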
Calling g(core) and permuting a copy of the array returns 4 identically 'shuffled' arrays. This seems to indicate that the working copy is not local to the child process. Is that correct? If so, other than using shared memory space, is it possible to have an ndarray be local to the child process when it needs to be filled from shared memory space?
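One way I can think of to probe whether the copy made inside g is really local is to have the child overwrite its copy and then inspect the parent's array after join; if the copy is local, the parent's data should be untouched. A minimal sketch (the small all-zeros stand-in for globalshared_array is just for illustration):

import multiprocessing as mp
import numpy as np

globalshared_array = np.zeros((10, 4))

def g(core):
    # np.copy allocates fresh memory inside this child process
    local = np.copy(globalshared_array[:, core])
    local += 99  # write only to the child's copy
    print core, local[0]

if __name__ == '__main__':
    jobs = [mp.Process(target=g, args=(core,)) for core in range(4)]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    # If the copies are local, the parent's array is still all zeros
    print globalshared_array.sum()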
EDIT:
Altering g(core) to add a random integer appears to have the desired effect: the arrays show different values. Something must be occurring in permutation that is randomly ordering the columns (local to each child process) in the same way...ideas?
def g(core):
    # Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:, core])
    local += random.randint(0, 10)
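If the identical orderings come from each forked child inheriting the parent's NumPy random state, then explicitly reseeding inside the worker should make the permutations differ. A sketch of that check (the per-process np.random.seed(os.getpid()) call and the toy array where every column is 0..9 are my own additions):

import multiprocessing as mp
import os
import numpy as np

# Toy stand-in: every column holds 0..9, so identical orderings are easy to spot
globalshared_array = np.tile(np.arange(10), (4, 1)).T

def g(core):
    # Give this process its own random state instead of the one copied at fork
    np.random.seed(os.getpid())
    local = np.copy(globalshared_array[:, core])
    shuffled = np.random.permutation(local)
    print core, shuffled

if __name__ == '__main__':
    jobs = [mp.Process(target=g, args=(core,)) for core in range(4)]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()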
EDIT II:
np.random.shuffle also exhibits the same behavior. The contents of the array are being shuffled, but they shuffle into the same order on each core.
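One quick check of whether the children really start from the same NumPy random state (which would explain both permutation and shuffle behaving this way) is to print a few words of np.random.get_state() inside each worker; identical output across cores would mean each child got a clone of the parent's generator. A sketch, assuming a fork-based start method:

import multiprocessing as mp
import numpy as np

def show_state(core):
    # get_state()[1] is the Mersenne Twister key array; matching leading
    # words across workers mean the inherited state is identical
    keys = np.random.get_state()[1]
    print core, keys[:3]

if __name__ == '__main__':
    jobs = [mp.Process(target=show_state, args=(core,)) for core in range(4)]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()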