I have attempted to recreate the essence of my "real-world" problem using the small reproducible example below. This example attempts to leverage functionality I found here. The real-world run takes 16 days on a single core of my laptop, which has 16 cores, so I'm hoping to cut the runtime down to one or two days by using the majority of those cores. First, however, I need to understand what I'm doing wrong with the small example below.
The example starts by setting up a list of tuples called all_combos. The idea is to then pass each tuple within all_combos to the function do_one_run(). My goal is to parallelize do_one_run() using multiprocessing. Unfortunately, the small reproducible example below kicks back error messages that I'm unable to resolve. My suspicion is that I've misunderstood how the Pool works, in particular how each tuple of parameters is mapped to the arguments of do_one_run(), or perhaps I've misunderstood how to collect the output of do_one_run(), or more likely both?
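For reference, here is my mental model of how starmap is supposed to work, boiled down to a toy case (the function add3 and the values are purely illustrative): each tuple in the iterable gets unpacked into the function's positional arguments. If this model is already wrong, that would explain a lot:

import multiprocessing as mp

def add3(a, b, c):
    return a + b + c

if __name__ == "__main__":
    combos = [(1, 2, 3), (4, 5, 6)]
    with mp.Pool(2) as pool:
        # each tuple is unpacked into add3's positional arguments
        sums = pool.starmap(add3, combos)
    print(sums)  # [6, 15]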
Any insights very much welcome!
import random
import numpy as np
import multiprocessing as mp
slns = {}
var1 = [5, 6, 7]
var2 = [2, 3, 4]
var3 = [10, 9, 8]
all_combos = []
key = 0
for v1 in var1:
    for v2 in var2:
        for v3 in var3:
            all_combos.append([key, v1, v2, v3])
            key += 1

def example_func(v1_passed, v2_passed, v3_passed):
    tmp = np.random.random((v1_passed, v2_passed, v3_passed))*100
    my_arr = tmp.astype(int)
    piece_arr = my_arr[1,:,1:3]
    return piece_arr

def do_one_run(key, v1_passed, v2_passed, v3_passed):
    results = example_func(v1_passed, v2_passed, v3_passed)
    slns.update({key: [v1_passed, v2_passed, v3_passed, results]})
pool = mp.Pool(4) # 4 cores devoted to job?
result = pool.starmap(do_one_run, all_combos)
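In case it sharpens the question, here is the direction I've been considering but haven't verified. It rests on two guesses: that each worker process gets its own copy of slns, so updates made inside do_one_run never reach the parent, and that the Pool needs to be created under an if __name__ == "__main__": guard. The sketch has do_one_run return its result and rebuilds slns in the parent from starmap's return value:

def do_one_run(key, v1_passed, v2_passed, v3_passed):
    results = example_func(v1_passed, v2_passed, v3_passed)
    # return everything to the parent instead of mutating a global;
    # workers cannot usefully update the parent's slns dict
    return key, [v1_passed, v2_passed, v3_passed, results]

if __name__ == "__main__":
    with mp.Pool(4) as pool:  # 4 worker processes
        pairs = pool.starmap(do_one_run, all_combos)
    slns = dict(pairs)  # {key: [v1, v2, v3, results], ...}

Is this roughly the right shape, or am I still missing something about how Pool collects output?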