I am trying to parallelize, with multiprocessing, a function that takes a long time when executed iteratively.
A simplified version would be:
from multiprocessing import Process, Manager

def f(my_dict, key, list1, list2):
    # For each element of list1, count how many elements of list2
    # match it, then store the grand total for this key.
    count = [0] * len(list1)
    for i, val1 in enumerate(list1):
        count[i] = sum(belongs_to(val1, val2) for val2 in list2)
    my_dict[key] = sum(count)

manager = Manager()
my_dict = manager.dict()

# One process per entry of other_dict; list2 is passed to every process.
jobs = [Process(target=f, args=(my_dict, record, value, list2))
        for record, value in other_dict.items()]
for p in jobs:
    p.start()
for p in jobs:
    p.join()

my_dict = dict(my_dict)
When I run this code, my memory is overrun. Is there an easy way to limit the number of processes running at the same time?
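For instance, is multiprocessing.Pool the right tool for this? Here is a rough sketch of what I have in mind (count_matches is just my f returning its result instead of writing into a shared dictionary, and belongs_to, list2 and other_dict are dummy stand-ins so the snippet runs on its own):

from multiprocessing import Pool

# Stand-ins for the real data, just to make the sketch self-contained.
def belongs_to(val1, val2):
    return val1 == val2

list2 = [1, 2, 3, 4]
other_dict = {"a": [1, 2], "b": [3, 5]}

def count_matches(key, list1):
    # Same work as f, but the result is returned rather than
    # written into a Manager dict.
    total = sum(sum(belongs_to(val1, val2) for val2 in list2)
                for val1 in list1)
    return key, total

if __name__ == "__main__":
    # At most 4 worker processes alive at any time.
    with Pool(processes=4) as pool:
        my_dict = dict(pool.starmap(count_matches, other_dict.items()))
    print(my_dict)

Would capping the pool size like this keep the memory usage bounded, or am I missing something?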
Also, I am sharing the dictionary that receives the results between all the processes thanks to Manager. Is there a way to also share the lists given to the function f, since list2 is always the same?
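From what I have read, a Pool initializer might let each worker receive list2 once instead of once per task, but I am not sure this is the intended use. Something like this (init_worker and _list2 are just names I made up):

from multiprocessing import Pool

def belongs_to(val1, val2):   # placeholder, as above
    return val1 == val2

_list2 = None                 # filled in once per worker process

def init_worker(shared_list2):
    # Runs once in each worker, so list2 is pickled and sent
    # once per worker instead of once per task.
    global _list2
    _list2 = shared_list2

def count_matches(key, list1):
    total = sum(sum(belongs_to(val1, val2) for val2 in _list2)
                for val1 in list1)
    return key, total

if __name__ == "__main__":
    list2 = [1, 2, 3, 4]
    other_dict = {"a": [1, 2], "b": [3, 5]}
    with Pool(processes=4, initializer=init_worker, initargs=(list2,)) as pool:
        my_dict = dict(pool.starmap(count_matches, other_dict.items()))
    print(my_dict)

Is this the idiomatic way to avoid copying list2 for every call, or is there something better?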