Say I have a huge immutable dataset represented as, say, a tuple. Let's say this dataset consumes so much of the working memory that it is impossible to copy it.

Is there a way in Python to share that tuple with other Python processes on the same machine, such that:

  1. the data does not need to be copied, neither wholly nor in small parts
  2. access to the data is fast and does not rely on IPC mechanisms like sockets and pipes
  3. I don't have to represent the data as raw shared memory, i.e. I can keep using it as tuples
  4. the representation maintains immutability semantics, i.e. I can't easily overwrite the memory and ruin computations
  5. ideally it would be cross-platform, or at least work on Windows and Linux.
Aviad Rozenhek
  • It depends on your Operating System. – noxdafox Nov 15 '19 at 13:25
  • @noxdafox what would you have for linux or windows? – Aviad Rozenhek Nov 15 '19 at 15:22
  • AFAIK it is not possible, though I am always happy to be corrected. You can get some of what you want with v3.8's remarkable new **Shared Memory** https://docs.python.org/3.9/library/multiprocessing.shared_memory.html (but mainly for NumPy arrays), and some other aspects of what you want with **Redis**. I don't think you can get both. – Mark Setchell Nov 15 '19 at 15:38
  • CPython mutates reference counts every time you access any object (even intrinsically read-only access to tuples). That’s why all the shared-memory approaches involve non-Python data structures like monolithic Numpy arrays. – Davis Herring Nov 15 '19 at 17:43
  • I don't think you're going to find what you're looking for except for very basic ctypes or NumPy arrays. As an alternative you might consider using a database, or, for accessing (via a pipe) small portions of more complex shared Python objects, there is a nice pattern [here](https://stackoverflow.com/questions/47837206/sharing-a-complex-python-object-in-memory-between-separate-processes/48326778#48326778) – bivouac0 Nov 16 '19 at 03:03
  • You might want to look at [POSH](http://poshmodule.sourceforge.net). Unfortunately it looks like the code there is stale and doesn't compile under Python 3, but it appears to work under Python 2. I'd be very interested to hear if this works for you, as I sometimes have similar needs for large data. – bivouac0 Nov 16 '19 at 05:52
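
To make the shared-memory suggestion from the comments concrete, here is a minimal sketch using only the standard library (Python 3.8+). It assumes the data can be flattened to fixed-size 64-bit integers, which is an illustrative choice and not something the question states. The helper names `publish`, `attach_readonly`, and `demo` are hypothetical. It does not satisfy every requirement: the data is copied once when the block is created, it is no longer a real tuple, and each element read still creates a small Python `int`. What it does show is attaching by name without a further copy and getting a view that rejects writes.

```python
import struct
from multiprocessing import shared_memory

ITEM = "q"                          # signed 64-bit integers (illustrative assumption)
ITEM_SIZE = struct.calcsize(ITEM)


def publish(values):
    """Copy a non-empty tuple of ints into a new shared block, once, in the owner."""
    shm = shared_memory.SharedMemory(create=True, size=ITEM_SIZE * len(values))
    struct.pack_into(f"{len(values)}{ITEM}", shm.buf, 0, *values)
    return shm


def attach_readonly(name, count):
    """Attach to an existing block by name and return (handle, read-only view).

    Another process would call this with the block name and item count
    it was given (for example on its command line).
    """
    shm = shared_memory.SharedMemory(name=name)
    # The mapping may be larger than requested on some platforms,
    # so slice down to the real payload before casting.
    view = shm.buf[: ITEM_SIZE * count].toreadonly().cast(ITEM)
    return shm, view


def demo():
    """Create a block, attach a second handle to it, and try to write."""
    owner = publish((10, 20, 30))
    try:
        reader, view = attach_readonly(owner.name, 3)
        try:
            total = sum(view)       # reads directly from the shared mapping
            try:
                view[0] = 99        # writes through a read-only view are rejected
                writable = True
            except TypeError:
                writable = False
        finally:
            view.release()          # views must be released before close()
            reader.close()
    finally:
        owner.close()
        owner.unlink()              # free the block once nobody needs it
    return total, writable
```

Here `demo()` attaches from the same process only to keep the sketch short; a separate process would pass the same name to `attach_readonly`. The read-only view protects against accidental writes through that view only. Any process that attaches can still obtain a writable `shm.buf`, so this is a convention and not an enforced guarantee.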

0 Answers