I'm in a situation where I need to parallel process a very big numpy array (55x117x256x256). Trying to pass it around with the usual multiprocessing approach gives an AssertionError, which I understand to be because the array is too big to copy into each process. Because of this, I would like to try using shared memory with multiprocessing. (I'm open to other approaches, provided they aren't too complicated).
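For context, my current (failing) attempt looks roughly like this; processChunk is just a hypothetical stand-in for my real per-slice processing function:

import numpy as np
import multiprocessing as mp

def processChunk(args):
    # hypothetical stand-in for the real per-slice processing
    data, i = args
    return data[i].mean()

if __name__ == '__main__':
    bigData = np.zeros((55, 117, 256, 256))  # ~3.4 GB of float64; the real data is loaded elsewhere
    with mp.Pool(4) as pool:
        # each task pickles the whole array to send it to a worker, which is
        # (as far as I can tell) where the AssertionError comes from
        results = pool.map(processChunk, [(bigData, i) for i in range(bigData.shape[0])])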
I've seen a few questions asking about the use of Python multiprocessing's shared memory approach, e.g.:
import numpy as np
import multiprocessing as mp
unsharedData = np.zeros((10,))
sharedData = mp.Array('d', unsharedData)
which seems to work fine. However, I haven't yet seen an example where this is done with a multidimensional array.
I've tried just sticking the multidimensional array into mp.Array, which gives me TypeError: only size-1 arrays can be converted to Python scalars:
unsharedData2 = np.zeros((10, 10))
sharedData2 = mp.Array('d', unsharedData2)  # raises the TypeError above
I can flatten the array, but I'd rather not if it can be avoided.
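For reference, the flattened version I'm trying to avoid would look something like the sketch below; initWorker and sliceMean are just hypothetical names, and the per-slice mean stands in for my real processing:

import numpy as np
import multiprocessing as mp

shape = (55, 117, 256, 256)

def initWorker(sharedArr, arrShape):
    # give each worker a numpy view onto the shared buffer (no extra copy)
    global workerData
    workerData = np.frombuffer(sharedArr, dtype='d').reshape(arrShape)

def sliceMean(i):
    # hypothetical stand-in for the real per-slice processing
    return workerData[i].mean()

if __name__ == '__main__':
    bigData = np.zeros(shape)  # the real data is loaded elsewhere
    # lock=False returns a raw ctypes array, which np.frombuffer can wrap directly
    sharedArr = mp.Array('d', bigData.size, lock=False)
    np.frombuffer(sharedArr, dtype='d')[:] = bigData.ravel()  # the flattening copy I'd rather avoid
    with mp.Pool(initializer=initWorker, initargs=(sharedArr, shape)) as pool:
        results = pool.map(sliceMean, range(shape[0]))

This works, but the explicit ravel()/reshape bookkeeping is exactly what I'd like to skip.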
Is there some trick to get multiprocessing Array to handle multidimensional data?