I'm trying to parallelize some Python code, and I'm running into problems with calling the operation as a function because of large constant data.
My non-parallel code goes something like this...
# ps is a list of points
result = np.empty(len(ps))
# compute a, b, c: large numpy arrays independent of p
for i, p in enumerate(ps):
    # compute some stuff dependent on p, a, b, c
    result[i] = scalar
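For concreteness, here is a minimal runnable stand-in for that loop (the matrix-vector products are just hypothetical placeholders; my real computation is different):

import numpy as np

n = 2000
a = np.random.rand(n, n)  # large constant arrays, independent of p
b = np.random.rand(n, n)
c = np.random.rand(n, n)

ps = [np.random.rand(n) for _ in range(100)]  # list of points
result = np.empty(len(ps))
for i, p in enumerate(ps):
    # placeholder for the real p-dependent computation
    result[i] = (a @ p).sum() + (b @ p).sum() + (c @ p).sum()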
Wrapping the loop body in a function, so that the large arrays a, b, c are not recomputed on every call, leaves me with
def calc(p, a, b, c):
    # compute some stuff dependent on p, a, b, c
    return scalar
which I then call in parallel using joblib's Parallel and delayed:

results = Parallel(n_jobs=num_cores)(delayed(calc)(p, a, b, c) for p in ps)
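To compare the two versions, I time them roughly like this, using the stand-in arrays from above (perf_counter is just my choice of wall-clock timer):

import time
from joblib import Parallel, delayed

def calc(p, a, b, c):
    # same placeholder math as the stand-in loop above
    return (a @ p).sum() + (b @ p).sum() + (c @ p).sum()

start = time.perf_counter()
serial = [calc(p, a, b, c) for p in ps]  # the old loop, for reference
print("serial loop:", time.perf_counter() - start)

for num_cores in (1, 4):
    start = time.perf_counter()
    results = Parallel(n_jobs=num_cores)(delayed(calc)(p, a, b, c) for p in ps)
    print(f"{num_cores} core(s):", time.perf_counter() - start)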
Testing on different core counts actually speeds up the process roughly as expected (cutting the time by a factor of 3 on 4 cores is what I get), BUT running on just one core is so much slower than the old loop version that the speedup doesn't cut it.
I'm blaming that on the data exchange needed to pass a, b and c to each call. Is there a way to get around that restriction?
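For what it's worth, a rough way to gauge how heavy that exchange is would be timing how long it takes to pickle the constants (only a proxy; I don't know exactly how joblib ships the data to the workers):

import pickle
import time

start = time.perf_counter()
blob = pickle.dumps((a, b, c), protocol=pickle.HIGHEST_PROTOCOL)
print(f"pickling a, b, c: {time.perf_counter() - start:.3f} s, "
      f"{len(blob) / 1e6:.1f} MB")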