I'm trying to parallelize some Python code, and I'm running into problems with calling the operation as a function because of large constant data.
My non-parallel code goes something like this...
# ps is a list of points
result = np.empty(len(ps))
# compute a, b, c: large numpy arrays independent of p
for i, p in enumerate(ps):
    # compute some stuff dependent on p, a, b, c
    result[i] = scalar
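For concreteness, here is a minimal runnable stand-in for that loop (the matrix-vector products are just hypothetical placeholders; my real computation is different):

import numpy as np

n = 2000
a = np.random.rand(n, n)  # large constant arrays, independent of p
b = np.random.rand(n, n)
c = np.random.rand(n, n)

ps = [np.random.rand(n) for _ in range(100)]  # list of points
result = np.empty(len(ps))
for i, p in enumerate(ps):
    # placeholder for the real p-dependent computation
    result[i] = (a @ p).sum() + (b @ p).sum() + (c @ p).sum()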
Wrapping the loop body in a function, so that the large arrays a, b, c are not recomputed on every call, leaves me with
def calc(p, a, b, c):
    # compute some stuff dependent on p, a, b, c
    return scalar
which I then call in parallel using joblib's Parallel and delayed:

results = Parallel(n_jobs=num_cores)(delayed(calc)(p, a, b, c) for p in ps)
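To compare the two versions, I time them roughly like this, using the stand-in arrays from above (perf_counter is just my choice of wall-clock timer):

import time
from joblib import Parallel, delayed

def calc(p, a, b, c):
    # same placeholder math as the stand-in loop above
    return (a @ p).sum() + (b @ p).sum() + (c @ p).sum()

start = time.perf_counter()
serial = [calc(p, a, b, c) for p in ps]  # the old loop, for reference
print("serial loop:", time.perf_counter() - start)

for num_cores in (1, 4):
    start = time.perf_counter()
    results = Parallel(n_jobs=num_cores)(delayed(calc)(p, a, b, c) for p in ps)
    print(f"{num_cores} core(s):", time.perf_counter() - start)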
Testing on different core counts actually speeds up the process roughly as expected (cutting the time by a factor of 3 on 4 cores is what I get), BUT running on just one core is so much slower than the old loop version that the speedup doesn't cut it.
I'm blaming that on the data exchange needed to pass a, b and c to each call. Is there a way to get around that restriction?
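For what it's worth, a rough way to gauge how heavy that exchange is would be timing how long it takes to pickle the constants (only a proxy; I don't know exactly how joblib ships the data to the workers):

import pickle
import time

start = time.perf_counter()
blob = pickle.dumps((a, b, c), protocol=pickle.HIGHEST_PROTOCOL)
print(f"pickling a, b, c: {time.perf_counter() - start:.3f} s, "
      f"{len(blob) / 1e6:.1f} MB")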