
I'm using joblib with Dask to parallelize code that has the following loop structure:

def main():
    for semtype in semtypes:
        test = get_valid_systems(systems, semtype)
        expressions = get_ensemble_pairs(test)
    
        for c in expressions:

            <do stuff>

My first attempt was to parallelize only the inner loop:

if __name__ == '__main__':

    for semtype in semtypes:
        test = get_valid_systems(systems, semtype)
        expressions = get_ensemble_pairs(test)

        print('SYSTEMS FOR SEMTYPE', semtype, 'ARE', test)
    
        with joblib.parallel_backend('dask'):
            joblib.Parallel(verbose=10)(joblib.delayed(main)(c) for c in expressions)

which works fine.

Now I'd like to fold both loops into a single Parallel call, as in:

with joblib.parallel_backend('dask'):

    joblib.Parallel(verbose=100)(joblib.delayed(main)(semtype, c) for c in get_ensemble_pairs(get_valid_systems(systems, semtype)) for semtype in semtypes)

However, I'm getting an error that name 'semtype' is not defined. I'm assuming this is a scoping issue with the function calls in my Parallel statement, but I'm not sure how to deal with it.

horcle_buzz

1 Answer


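The outermost loop should come first. The for clauses of a generator expression are read left to right, exactly like nested for statements, so semtype has to be bound by an earlier clause before get_valid_systems(systems, semtype) can use it.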

with joblib.parallel_backend('dask'):

    joblib.Parallel(verbose=100)(joblib.delayed(main)(semtype, c) for semtype in semtypes for c in get_ensemble_pairs(get_valid_systems(systems, semtype)))
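
To see the ordering rule on its own, without joblib or Dask, here is a minimal sketch. SEMTYPES and get_pairs are hypothetical stand-ins for the question's semtypes and get_ensemble_pairs(get_valid_systems(systems, semtype)); they are not part of the original code.

```python
# Hypothetical stand-in for the question's `semtypes`.
SEMTYPES = ["a", "b"]


def get_pairs(semtype):
    # Hypothetical stand-in for
    # get_ensemble_pairs(get_valid_systems(systems, semtype)).
    return [semtype + "1", semtype + "2"]


def nested_loops():
    # The original structure: outer loop over semtypes, inner loop over pairs.
    tasks = []
    for semtype in SEMTYPES:
        for c in get_pairs(semtype):
            tasks.append((semtype, c))
    return tasks


def outer_clause_first():
    # Same clause order as the nested loops: `semtype` is bound first,
    # so the second clause can use it.
    return [(semtype, c) for semtype in SEMTYPES for c in get_pairs(semtype)]


def inner_clause_first():
    # Clause order swapped, as in the question: get_pairs(semtype) is
    # evaluated before any clause has bound `semtype`.
    return [(semtype, c) for c in get_pairs(semtype) for semtype in SEMTYPES]


if __name__ == "__main__":
    print(outer_clause_first() == nested_loops())
    try:
        inner_clause_first()
    except NameError as exc:
        print("NameError:", exc)
```

The swapped version fails for the same reason as the Parallel call in the question: the first iterable of a generator expression is evaluated before any loop variable exists.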
erncnerky
  • Well, that was embarrassing! I was going by this: [how-to-use-nested-loops-in-joblib-library-in-python](https://stackoverflow.com/questions/36423975/how-to-use-nested-loops-in-joblib-library-in-python?rq=1) – horcle_buzz Jan 24 '21 at 23:44