
I'm trying to use Joblib to parallelize some calculations in my code. Here's what I have so far:

data = {'tag': [], 'radials': []}

def append_radials(*args):
    for tag in range(len(sacla_tags)):
        det_data = get_detdata_for_single_tag(sacla_tags[tag], beamline, run)
        radial_profile = np.array(profiler.get_mean_profile(det_data))
        data['radials'].append(radial_profile)
        data['tag'].append(sacla_tags[tag])

args = [sacla_tags, beamline, run, profiler]
Parallel(n_jobs=5, verbose=50)(delayed(append_radials)(*args) for i in range(5))

In short, the append_radials function gathers some data, computes a radial profile from it, then appends the output to the data dictionary. This iterates over a very large dataset and takes ~15 minutes to complete. I want to use joblib to speed this up a bit.

The way the code is written, it loops through every tag (len(sacla_tags)=5000) in each job. Instead, I want it to split the tags into, say, 5 different jobs (1000 tags per job), then append the output to the global dictionary data.

Is this possible? What am I doing wrong here?


EDIT:

Here's a simpler example of my code.

def f(i): 
     dat = [] 
     q = i**2 * np.sqrt(i**(3/2)) 
     dat.append(q) 
     return dat 

x = np.linspace(1,10000000,1000000)
p = Parallel(n_jobs=5, verbose=50)(delayed(f)(x) for j in range(5))

The output of p here is:

In [37]: p                                                                                                                     
Out[37]: 
[[array([1.00000000e+00, 7.30854392e+02, 4.32617501e+03, ...,
         1.77826963e+19, 1.77827452e+19, 1.77827941e+19])],
 [array([1.00000000e+00, 7.30854392e+02, 4.32617501e+03, ...,
         1.77826963e+19, 1.77827452e+19, 1.77827941e+19])],
 [array([1.00000000e+00, 7.30854392e+02, 4.32617501e+03, ...,
         1.77826963e+19, 1.77827452e+19, 1.77827941e+19])],
 [array([1.00000000e+00, 7.30854392e+02, 4.32617501e+03, ...,
         1.77826963e+19, 1.77827452e+19, 1.77827941e+19])],
 [array([1.00000000e+00, 7.30854392e+02, 4.32617501e+03, ...,
         1.77826963e+19, 1.77827452e+19, 1.77827941e+19])]]

Each array in p is the same length; joblib didn't split up the range. Instead, it ran f(x) five times.

I know this is some basic concept I'm not quite grasping with how joblib works, but how would I tell it to, for example, split x up into 5 chunks and run those in parallel?


1 Answer

from joblib import Parallel, delayed
import numpy as np

# generate data
data = np.random.randint(0, 10, size=100)
# split data into as many chunks as desired (5 in this case)
split_data = np.array_split(data, 5)

# the function one wants to be parallelized 
def f(num):
    # calculation of interest
    q = num**2 * np.sqrt(num**(3/2)) 
    return q 
# run function over the 5 chunks in parallel
parallel_results = Parallel(n_jobs=10)(delayed(f)(s) for s in split_data)

This splits the data into 5 chunks and runs the function you are interested in on all of these chunks in parallel, with up to 10 workers (n_jobs) available. Since there are only 5 chunks, only 5 jobs actually run. This results in a list of 5 arrays (one array per chunk).
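If you want a single flat array afterwards, the per-chunk results can be stitched back together in order (Parallel returns results in submission order). A minimal, self-contained sketch reusing the setup above:

```python
import numpy as np
from joblib import Parallel, delayed

def f(num):
    # same calculation as above, applied elementwise to a chunk
    return num**2 * np.sqrt(num**(3/2))

data = np.random.randint(0, 10, size=100)
split_data = np.array_split(data, 5)

parallel_results = Parallel(n_jobs=5)(delayed(f)(s) for s in split_data)

# reassemble the 5 per-chunk arrays into one array in the original order
combined = np.concatenate(parallel_results)
```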

The (main) problem with your code is that you never use j in your parallel call.
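For the edited example, one way to actually use the loop variable is to pre-split x and pass chunk j to job j, so each job receives different data. A hedged sketch:

```python
import numpy as np
from joblib import Parallel, delayed

def f(i):
    return i**2 * np.sqrt(i**(3/2))

x = np.linspace(1, 10000000, 1000000)
chunks = np.array_split(x, 5)  # 5 chunks of 200000 elements each

# pass chunk j to job j instead of passing the whole array every time
p = Parallel(n_jobs=5)(delayed(f)(chunks[j]) for j in range(5))
# p is now a list of 5 distinct arrays covering different parts of x
```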

If you really need to append to a shared list, take a look at this answer
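For completeness, the same pattern applied to the original append_radials case could look like the sketch below. Here radial_for_tag is a hypothetical stub standing in for the question's get_detdata_for_single_tag/profiler calls; the key change is returning results from each job instead of appending to the global data dict, since joblib's default backend runs workers in separate processes and their mutations never reach the parent.

```python
import numpy as np
from joblib import Parallel, delayed

# hypothetical stub standing in for the detector/profiler calls
def radial_for_tag(tag):
    return np.full(3, float(tag))

def process_chunk(tags):
    # return results instead of mutating a global dict: workers run in
    # separate processes, so appends in a worker would be lost
    return [(tag, radial_for_tag(tag)) for tag in tags]

sacla_tags = list(range(20))
chunks = np.array_split(sacla_tags, 5)
results = Parallel(n_jobs=5)(delayed(process_chunk)(c) for c in chunks)

# flatten the per-chunk lists and rebuild the dict in the parent process
data = {'tag': [], 'radials': []}
for chunk_result in results:
    for tag, profile in chunk_result:
        data['tag'].append(tag)
        data['radials'].append(profile)
```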