I read an old question Why does this python multiprocessing script slow down after a while? and many others before posting this one. They do not answer the problem I'm having.
IDEA OF THE SCRIPT. The script generates arrays, 256x256, in a serialised loop. Elements of an array are calculated one-by-one from a list that contains dictionaries with relevant params, one dictionary per an array element (256x256 in total per a list). The list is the way for me to enable parallel calculations.
THE PROBLEM. In the beginning, the generation of the data speeds up from a dozen seconds up-to a few seconds. Then, after a few iterations, it starts slowing down a fraction of a second with each new array generated to the point it takes forever to calculate anything.
Additional info.
- I am using a pool.map function. After making a few small changes to identify which element is being calculated, I also tried using map_async. Unfortunately, it is slower because I need to init the pool each time I finish calculating an array.
- When using the pool.map, I init the pool once before anything starts. In this way, I hope to save time initializing the pool in comparison to map_async.
- CPU shows low usage, up to ~18%.
- In my instance, a hard-drive isn't a bottleneck. All the data necessary for calculations is in RAM. I also do not save data onto a hard-drive keeping everything in RAM.
- I also checked if the problem persists if I use a different number of cores, 2-24. No changes either.
- I made some additional tests by running and terminating a pool, a. each time an array is generated, b. every 10 arrays. I noticed that in each case execution of the code slows down compared to the previous pool's execution time, i.e. if the previous slowed down to 5s, another one will be 5.Xs and so on. The only time the execution doesn't slow down is when I run the code serially.
- Working env: Windows 10, Python 3.7, conda 4.8.2, Spyder 4.
THE QUESTION: Why multiprocessing slows down after a while in the case where only CPU & RAM are involved (no hard-drive slowdown)? Any idea?
UPDATED CODE:
import multiprocessing as mp
from tqdm import tqdm
import numpy as np
import random
def wrapper_(arg):
return tmp.generate_array_elements(
self=arg['self'],
nu1=arg['nu1'],
nu2=arg['nu2'],
innt=arg['innt'],
nu1exp=arg['nu1exp'],
nu2exp=arg['nu2exp'],
ii=arg['ii'],
jj=arg['jj'],
llp=arg['self'].llp,
rr=arg['self'].rr,
)
class tmp:
def __init__(self, multiprocessing, length, n_of_arrays):
self.multiprocessing = multiprocessing
self.inshape = (length,length)
self.length = length
self.ll_len = n_of_arrays
self.num_cpus = 8
self.maxtasksperchild = 10000
self.rr = 0
"""original function is different, modified to return something"""
"""for the example purpose, lp is not relevant here but in general is"""
def get_ll(self, lp):
return [random.sample((range(self.length)),int(np.random.random()*12)+1) for ii in range(self.ll_len)]
"""original function is different, modified to return something"""
def get_ip(self): return np.random.random()
"""original function is different, modified to return something"""
def get_op(self): return np.random.random(self.length)
"""original function is different, modified to return something"""
def get_innt(self, nu1, nu2, ip):
return nu1*nu2/ip
"""original function is different, modified to return something"""
def __get_pp(self, nu1):
return np.exp(nu1)
"""dummy function for the example purpose"""
def dummy_function(self):
"""do important stuff"""
return
"""dummy function for the example purpose"""
def dummy_function_2(self, result):
"""do important stuff"""
return np.reshape(result, np.inshape)
"""dummy function for the example purpose"""
def dummy_function_3(self):
"""do important stuff"""
return
"""original function is different, modified to return something"""
"""for the example purpose, lp is not relevant here but in general is"""
def get_llp(self, ll, lp):
return [{'a': np.random.random(), 'b': np.random.random()} for ii in ll]
"""NOTE, lp is not used here for the example purpose but
in the original code, it's very important variable containg
relevant data for calculations"""
def generate(self, lp={}):
"""create a list that is used to the creation of 2-D array"""
"""providing here a dummy pp param to get_ll"""
ll = self.get_ll(lp)
ip = self.get_ip()
self.op = self.get_op()
"""length of args_tmp = self.length * self.length = 256 * 256"""
args_tmp = [
{'self': self,
'nu1': nu1,
'nu2': nu2,
'ii': ii,
'jj': jj,
'innt': np.abs(self.get_innt(nu1, nu2, ip)),
'nu1exp': np.exp(1j*nu1*ip),
'nu2exp': np.exp(1j*nu2*ip),
} for ii, nu1 in enumerate(self.op) for jj, nu2 in enumerate(self.op)]
pool = {}
if self.multiprocessing:
pool = mp.Pool(self.num_cpus, maxtasksperchild=self.maxtasksperchild)
"""number of arrays is equal to len of ll, here 300"""
for ll_ in tqdm(ll):
"""Generate data"""
self.__generate(ll_, lp, pool, args_tmp)
"""Create a pool of CPU threads"""
if self.multiprocessing:
pool.terminate()
def __generate(self, ll, lp, pool = {}, args_tmp = []):
"""In the original code there are plenty other things done in the code
using class' methods, they are not shown here for the example purpose"""
self.dummy_function()
self.llp = self.get_llp(ll, lp)
"""originally the values is taken from lp"""
self.rr = self.rr
if self.multiprocessing and pool:
result = pool.map(wrapper_, args_tmp)
else:
result = [wrapper_(arg) for arg in args_tmp]
"""In the original code there are plenty other things done in the code
using class' methods, they are not shown here for the example purpose"""
result = self.dummy_function_2(result)
"""original function is different"""
def generate_array_elements(self, nu1, nu2, llp, innt, nu1exp, nu2exp, ii = 0, jj = 0, rr=0):
if rr == 1 and self.inshape[0] - 1 - jj < ii:
return 0
elif rr == -1 and ii > jj:
return 0
elif rr == 0:
"""do nothing"""
ll1 = []
ll2 = []
"""In the original code there are plenty other things done in the code
using class' methods, they are not shown here for the example purpose"""
self.dummy_function_3()
for kk, ll in enumerate(llp):
ll1.append(
self.__get_pp(nu1) *
nu1*nu2*nu1exp**ll['a']*np.exp(1j*np.random.random())
)
ll2.append(
self.__get_pp(nu2) *
nu1*nu2*nu2exp**ll['b']*np.exp(1j*np.random.random())
)
t1 = sum(ll1)
t2 = sum(ll2)
result = innt*np.abs(t1 - t2)
return result
g = tmp(False, 256, 300)
g.generate()