I'm using multiprocessing.Queue to pass numpy arrays of float64 between Python processes. This is working fine, but I'm worried it may not be as efficient as it could be.
According to the documentation of multiprocessing, objects placed on the Queue will be pickled. Calling pickle.dumps on a numpy array, as in the example here, results in a text representation of the data, so null bytes get replaced by the string "\\x00".
>>> pickle.dumps(numpy.zeros(10))
"cnumpy.core.multiarray\n_reconstruct\np0\n(cnumpy\nndarray\np1\n(I0\ntp2\nS'b'\np3\ntp4\nRp5\n(I1\n(I10\ntp6\ncnumpy\ndtype\np7\n(S'f8'\np8\nI0\nI1\ntp9\nRp10\n(I3\nS'<'\np11\nNNNI-1\nI-1\nI0\ntp12\nbI00\nS'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'\np13\ntp14\nb."
I'm concerned that this means my arrays are being expensively converted into something 4x the original size and then converted back in the other process.
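To put a number on that concern, here is a rough sketch of how the sizes could be compared. It assumes Python 3 and a recent NumPy, and the helper name pickled_sizes is just something made up for this illustration. It compares the raw buffer size against the pickle text protocol (protocol 0) and the highest binary protocol. I have not verified which protocol the Queue itself uses, so this only shows what the pickle step could cost in each case.

```python
import pickle

import numpy as np


def pickled_sizes(arr):
    """Return (raw_bytes, protocol0_bytes, binary_bytes) for one array.

    Hypothetical helper for illustration only; not part of numpy or pickle.
    """
    text = pickle.dumps(arr, protocol=0)  # ASCII-oriented text protocol
    binary = pickle.dumps(arr, protocol=pickle.HIGHEST_PROTOCOL)  # binary protocol
    return arr.nbytes, len(text), len(binary)


arr = np.zeros(1000)  # 1000 float64 values, i.e. 8000 bytes of raw data
raw, text_size, binary_size = pickled_sizes(arr)

print("raw buffer bytes:     ", raw)
print("protocol 0 bytes:     ", text_size)
print("binary protocol bytes:", binary_size)

# Round trip through the binary protocol to confirm nothing is lost.
restored = pickle.loads(pickle.dumps(arr, protocol=pickle.HIGHEST_PROTOCOL))
print("round trip equal:     ", np.array_equal(restored, arr))
```

If the binary protocol figure comes out close to the raw buffer size, the overhead would depend mostly on which protocol is in play rather than on pickling as such.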
Is there a way to pass the data through the queue in a raw unaltered form?
I know about shared memory, but if that is the correct solution, I'm not sure how to build a queue on top of it.
Thanks!