I would like to use multiprocessing where one of the arguments to the worker functions is a very large numpy array. I've researched other posts that appear to address similar issues:
Large numpy arrays in shared memory for multiprocessing: Is sth wrong with this approach?
Share Large, Read-Only Numpy Array Between Multiprocessing Processes
but, being rather new to Python, I've had trouble adapting those solutions to this template and in this form. Could you help me understand my options for passing X to the functions in a read-only manner? My simplified snippet of code is here:
import multiprocessing as mp
import numpy as np

def funcA(X):
    # do something with X
    print 'funcA OK'

def funcB(X):
    # do something else with X
    print 'funcB OK'

if __name__ == '__main__':
    X = np.random.rand(int(5.00e8))
    funcA(X)  # OK
    funcB(X)  # OK

    X = np.random.rand(int(2.65e8))
    P = []
    P.append(mp.Process(target=funcA, args=(X,)))  # OK
    P.append(mp.Process(target=funcB, args=(X,)))  # OK
    for p in P:
        p.start()
    for p in P:
        p.join()

    X = np.random.rand(int(2.70e8))
    P = []
    P.append(mp.Process(target=funcA, args=(X,)))  # FAIL
    P.append(mp.Process(target=funcB, args=(X,)))  # FAIL
    for p in P:
        p.start()
    for p in P:
        p.join()
funcA and funcB accept very large numpy arrays when invoked sequentially. When they are invoked as separate processes, however, there appears to be an upper limit on the size of the array that can be passed: the cutoff falls between 2.65e8 and 2.70e8 float64 values, i.e. a little over 2 GB (2.65e8 * 8 bytes = 2.12e9 bytes works, 2.70e8 * 8 bytes = 2.16e9 bytes fails), which makes me suspect a 2^31-byte limit somewhere in how the arguments are serialized for the child processes. How could I best get around this? (A sketch of what I have tried so far follows the notes below.)
Note:
0) I do not wish to modify X, only to read from it;
1) I'm running on 64-bit Windows 7 Professional.
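For reference, below is a minimal sketch of how I have tried to adapt the shared-memory approach from the second linked post. It assumes the array can be filled once in the parent and that a multiprocessing.sharedctypes.RawArray can be handed to the children in place of the numpy array itself; I am not sure whether this is the intended usage:

import ctypes
import multiprocessing as mp
from multiprocessing.sharedctypes import RawArray

import numpy as np

def funcA(shared, n):
    # Wrap the shared buffer as a numpy array; no data is copied
    X = np.frombuffer(shared, dtype=np.float64, count=n)
    # do something with X (read-only)
    print 'funcA OK'

def funcB(shared, n):
    # Re-wrap the same shared buffer; again no copy is made
    X = np.frombuffer(shared, dtype=np.float64, count=n)
    # do something else with X (read-only)
    print 'funcB OK'

if __name__ == '__main__':
    n = int(2.70e8)
    # Allocate the backing store in shared memory instead of pickling
    # the whole array to each child
    shared = RawArray(ctypes.c_double, n)
    X = np.frombuffer(shared, dtype=np.float64, count=n)
    X[:] = np.random.rand(n)  # fill once in the parent
    P = [mp.Process(target=funcA, args=(shared, n)),
         mp.Process(target=funcB, args=(shared, n))]
    for p in P:
        p.start()
    for p in P:
        p.join()

I am unsure whether passing the RawArray through args is the correct way to share it on Windows, or whether something like a file-backed numpy.memmap would be the better fit for purely read-only access.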