When submitting a large number of tasks (with large arguments) via Pool.apply_async, the tasks are queued in memory until a worker process picks them up, and there is no limit on the number of pending tasks. This can end up eating all the memory, as in the example below:
import multiprocessing
import numpy as np

def f(a, b):
    return np.linalg.solve(a, b)

def test():
    p = multiprocessing.Pool()
    # Each submission queues a ~8 MB matrix until a worker picks it up.
    for _ in range(1000):
        p.apply_async(f, (np.random.rand(1000, 1000), np.random.rand(1000)))
    p.close()
    p.join()

if __name__ == '__main__':
    test()
I'm searching for a way to limit the waiting queue, so that only a limited number of tasks are pending at any time, and Pool.apply_async blocks while the waiting queue is full.
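One workaround I can imagine (a minimal sketch, not necessarily the right approach) is to throttle submissions with a semaphore that is released from the apply_async callback; the limit of 8 below is an arbitrary choice, and error_callback requires Python 3:

import multiprocessing
import threading
import numpy as np

def f(a, b):
    return np.linalg.solve(a, b)

def test():
    # Arbitrary cap on the number of tasks queued or running at once.
    limit = threading.BoundedSemaphore(8)
    p = multiprocessing.Pool()

    def release(_):
        # Runs in the pool's result-handler thread, in the parent process.
        limit.release()

    for _ in range(1000):
        limit.acquire()  # blocks here once 8 tasks are in flight
        p.apply_async(f,
                      (np.random.rand(1000, 1000), np.random.rand(1000)),
                      callback=release,
                      error_callback=release)
    p.close()
    p.join()

if __name__ == '__main__':
    test()

This seems to bound memory use, but I'd prefer something built into the Pool API if it exists.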