I have an example Python function below that takes a variable, performs a simple mathematical operation on it, and returns the result.
If I parallelise this function, to better reflect what I want to do in real life, and run the parallelised version 10 times, my IDE shows memory usage increasing on every run, despite the del results line.
import multiprocessing as mp
import numpy as np
from tqdm import tqdm

def function(x):
    return x*2

test_array = np.arange(0,1e4,1)

for i in range(10):
    pool = mp.Pool(processes=4)
    results = list(tqdm(pool.imap(function,test_array),total=len(test_array)))
    results = [x for x in results if str(x) != 'nan']
    del results
I have two questions I would be grateful to know the answers to:
- Is there a way to prevent this memory increase?
- Is this memory growth caused by the parallelisation itself?
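
One thing that may be worth ruling out: the loop above creates a new mp.Pool on every iteration and never closes it, so the worker processes from earlier iterations may still be alive. The following is only a sketch under that assumption, not a confirmed diagnosis. It uses the pool as a context manager so the workers are terminated at the end of each run. The helper name run_once is hypothetical, and to keep the block self-contained it uses a plain list and math.isnan in place of np.arange and the string comparison, and leaves out tqdm.

```python
import math
import multiprocessing as mp


def function(x):
    return x * 2


def run_once(values, processes=4):
    # Hypothetical helper: one parallel pass over `values`.
    # Leaving the with-block calls pool.terminate(), so the worker
    # processes do not outlive this call.
    with mp.Pool(processes=processes) as pool:
        # list() consumes every result before the pool is terminated.
        results = list(pool.imap(function, values))
    # Drop NaN results, as the original list comprehension does.
    return [x for x in results if not math.isnan(x)]


if __name__ == "__main__":
    # Stand-in for np.arange(0, 1e4, 1): 10,000 floats.
    test_array = [float(i) for i in range(10_000)]
    for i in range(10):
        results = run_once(test_array)
        del results
```

If memory still grows with this version, the pool handling is probably not the cause. Creating a single pool outside the loop and reusing it for all 10 runs would be another variant to compare against.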