I am doing a function optimization using an evolutionary algorithm (CMAES). To run it faster I am using the multiprocessing module. The function I need to optimize takes large matrices as inputs (input_A_Opt, and input_B_Opt)
in the code below.
They are several GBs of size. When I run the function without multiprocessing, it works well. When I use multiprocessing there seems to be a problem with memory. If I run it with small inputs it works well, but when I run with the full input, I get the following error:
File "<ipython-input-2-bdbae5b82d3c>", line 1, in <module>
opt.myFuncOptimization()
File "/home/joe/Desktop/optimization_folder/Python/Optimization.py", line 45, in myFuncOptimization
**f_values = pool.map_async(partial_function_to_optmize, solutions).get()**
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
File "/usr/lib/python3.5/multiprocessing/pool.py", line 385, in _handle_tasks
put(task)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 206, in send
self._send_bytes(ForkingPickler.dumps(obj))
File "/usr/lib/python3.5/multiprocessing/connection.py", line 393, in _send_bytes
header = struct.pack("!i", n)
error: 'i' format requires -2147483648 <= number <= 2147483647
And here's a simplified version of the code (again, if I run it with the input 10 times smaller, all works fine):
import numpy as np
import cma
import multiprocessing as mp
import functools
import myFuncs
import hdf5storage
def myFuncOptimization ():
temp = hdf5storage.loadmat('/home/joe/Desktop/optimization_folder/matlab_workspace_for_optimization')
input_A_Opt = temp["input_A"]
input_B_Opt = temp["input_B"]
del temp
numCores = 20
# Inputs
#________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
P0 = np.array([ 4.66666667, 2.5, 2.66666667, 4.16666667, 0.96969697, 1.95959596, 0.44088176, 0.04040404, 6.05210421, 0.58585859, 0.46464646, 8.75751503, 0.16161616, 1.24248497, 1.61616162, 1.56312625, 5.85858586, 0.01400841, 1.0, 2.4137931, 0.38076152, 2.5, 1.99679872 ])
LBOpt = np.array([ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ])
UBOpt = np.array([ 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, ])
initialStdsOpt = np.array([2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, ])
minStdsOpt = np.array([ 0.030, 0.40, 0.030, 0.40, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.020, 0.050, 0.050, 0.020, 0.40, 0.020, ])
options = {'bounds':[LBOpt,UBOpt], 'CMA_stds':initialStdsOpt, 'minstd':minStdsOpt, 'popsize':numCores}
es = cma.CMAEvolutionStrategy(P0, 1, options)
pool = mp.Pool(numCores)
partial_function_to_optmize = functools.partial(myFuncs.func1, input_A=input_A_Opt, input_B=input_B_Opt)
while not es.stop():
solutions = es.ask(es.popsize)
f_values = pool.map_async(partial_function_to_optmize, solutions).get()
es.tell(solutions, f_values)
es.disp(1)
es.logger.add()
return es.result_pretty()
Any suggestions on how to solve this issue? am I not coding properly (new to python) or should I use other multiprocessing package like scoop?