I pass large scipy.sparse arrays to a parallel processes on shared memory of one computing node. In each round of parallel jobs, the passed array will not be modified. I want to pass the array with zero-copy.
While this is possible with multiprocessing.RawArray() and numpy.sharedmem (see here), I am wondering how ray's put() works.
As far as I understood (see memory management, [1], [2]), ray's put() copies the object once and for all (serialize, then de-serialize) to the object store that is available for all processes.
Question:
I am not sure I understood it correctly, is it a deep copy of the entire array in the object store or just a reference to it? Is there a way to "not" copy the object at all? Rather, just pass the address/reference of the existing scipy array? Basically, a true shallow copy without the overhead of copying the entire array.
Ubuntu 16.04, Python 3.7.6, Ray 0.8.5.