I have a nested loop over a huge range of data. At some point it takes hours to calculate the values. I was wondering whether I can speed it up with Python's multiprocessing package. Here is my code:
def update_selections(all_selection):
    selections_filtered_all = []
    selections_filtered_all_minus_1 = []
    for n, values in enumerate(all_selection):
        items_set = set()
        sum_length = 0
        for y in values:
            items_set.update(y)
            sum_length += 1
        if sum_length == 300000000:  # sum_length is an int, so len(sum_length) would raise a TypeError
            selections_filtered_all.append(1)
    selections_filtered_all_minus_1.extend(selections_filtered_all)
Following this answer, this is my attempt, but it's not working:
from multiprocessing import Pool

def update_selections(all_selection):
    selections_filtered_all = []
    selections_filtered_all_minus_1 = []
    pool = Pool()
    for n, x in enumerate(all_selection):
        pool.map(process_selections, x)  # the returned results are discarded here
        selections_filtered_all_minus_1.extend(selections_filtered_all)

def process_selections(values):
    items_set = set()
    sum_length = 0
    selections_filtered_all = []  # local to the worker process, not shared with the parent
    for y in values:
        items_set.update(y)
        sum_length += 1
    if sum_length == 300000000:  # sum_length is an int, so len(sum_length) would raise a TypeError
        selections_filtered_all.append(1)
    return items_set, sum_length, selections_filtered_all
all_selection = ['xfRxx', 'asdeEFD', ...]
update_selections(all_selection)
I don't understand how to use Pool() inside a loop. Any suggestions would be appreciated.
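From what I have read so far, worker processes cannot mutate lists that live in the parent process, so the results have to come back through Pool.map's return value. Here is a minimal sketch of what I think the pattern should look like (a small threshold of 3 is used purely for illustration, and the helper names are my own, not from any library):

```python
from multiprocessing import Pool

def process_selections(values):
    # Process one element of all_selection in a worker process.
    items_set = set()
    sum_length = 0
    for y in values:
        items_set.update(y)
        sum_length += 1
    flag = [1] if sum_length == 3 else []  # tiny threshold just for this sketch
    return items_set, sum_length, flag

def update_selections(all_selection):
    # Map over the whole list once; each worker returns its results
    # instead of appending to a shared list (each process has its own memory).
    with Pool() as pool:
        results = pool.map(process_selections, all_selection)
    selections_filtered_all = []
    for items_set, sum_length, flag in results:
        selections_filtered_all.extend(flag)
    return selections_filtered_all

if __name__ == "__main__":
    print(update_selections(['xfRxx', 'asdeEFD']))
```

Note the `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that spawn rather than fork.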