I have a quick question regarding multiprocessing in Python.
I am conducting a rather large grid search over three parameters and the computation is taking ~14 hours to complete. I would like to shrink this run time down by using multiprocessing.
A very simplified example of my code is here:
import numpy as np
import pickle
import time

a_range = np.arange(14, 18, 0.2)
b_range = np.arange(1000, 5000, 200)
c_range = np.arange(12, 21, .5)

# Index positions used to place each result in data_grid
a_position = range(len(a_range))
b_position = range(len(b_range))
c_position = range(len(c_range))

data_grid = np.zeros([len(a_range), len(b_range), len(c_range)])
record_data = []

start_time = time.time()
for (a, apos) in zip(a_range, a_position):
    for (b, bpos) in zip(b_range, b_position):
        for (c, cpos) in zip(c_range, c_position):
            example = a + b + c  # The math in my model is much more complex and takes
                                 # about 7-8 seconds to process
            data_grid[apos, bpos, cpos] = example
            record_data.append([a, b, c, example])

with open('Test_File', 'wb') as f:
    pickle.dump(record_data, f)
np.save('example_values', data_grid)

print('Code ran for {} seconds'.format(round(time.time() - start_time, 2)))
I have absolutely zero experience with multiprocessing, so my first attempt was to move the body of the for loops into a function and then hand that function to pool.map like this:
def run_model(a, b, c, apos, bpos, cpos):
    example = a + b + c
    data_grid[apos, bpos, cpos] = example
    record_data.append([a, b, c, example])

from multiprocessing import Pool

if __name__ == '__main__':
    pool = Pool(processes=4)
    pool.map(run_model, [a_range, b_range, c_range, a_position, b_position, c_position])
    pool.close()
    pool.join()
This failed, however, at the pool.map call. I understand that pool.map only takes a single iterable argument, but I don't know how to fix the problem. I am also skeptical that the data_grid variable is going to be filled correctly. The result I want is two saved files: one an array of values whose indexes correspond to the a, b, and c values, and the other a list of lists containing the a, b, and c values together with the resulting value (example in the code above).
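To make the goal concrete, here is a rough sketch of the shape I suspect a fix might take. It is not a tested solution for my real model. It assumes run_model is changed to accept one (a, b, c) tuple and to return its value, since each worker process gets its own copy of data_grid and record_data and writes made there never reach the parent. The names grid_search and map_func are made up for this illustration, and a + b + c is still only a placeholder for the real math.

```python
import itertools
from multiprocessing import Pool

import numpy as np


def run_model(params):
    # pool.map passes exactly one argument, so the three values travel as a tuple.
    a, b, c = params
    return a + b + c  # placeholder for the real (slow) model


def grid_search(a_range, b_range, c_range, map_func=map):
    # Hypothetical helper: map_func can be the builtin map or pool.map.
    # itertools.product varies c fastest, which matches a C-order reshape.
    points = list(itertools.product(a_range, b_range, c_range))
    values = list(map_func(run_model, points))
    shape = (len(a_range), len(b_range), len(c_range))
    data_grid = np.array(values).reshape(shape)
    record_data = [[a, b, c, v] for (a, b, c), v in zip(points, values)]
    return data_grid, record_data


if __name__ == '__main__':
    a_range = np.arange(14, 18, 0.2)
    b_range = np.arange(1000, 5000, 200)
    c_range = np.arange(12, 21, .5)

    pool = Pool(processes=4)
    try:
        data_grid, record_data = grid_search(
            a_range, b_range, c_range, map_func=pool.map)
    finally:
        pool.close()
        pool.join()
    # data_grid and record_data could then be saved with np.save and
    # pickle.dump in the parent process, as in the serial version.
```

The idea is that the parent process does all of the bookkeeping: pool.map returns results in the same order as the input points, so the array and the list of lists can be rebuilt after the workers finish.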
Thanks for any help!
-Will