
I'm trying to make a fast library for interprocess communication between any combination of Python and C/C++ processes (i.e. Python <-> Python, Python <-> C++, or C++ <-> C++).

In the hopes of having the fastest implementation, I'm trying to utilize shared memory using mmap. The plan is for two processes to share memory by "mmap-ing" the same file and read from and write to this shared memory to communicate.

I want to avoid any actual writes to a real file, and instead simply want to use a filename as a handle for the two processes to connect. However, I get hung up on the following call to mmap:

self.memory = mmap.mmap(fileno, self.maxlen)

where I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'shared_memory_file'

or if I make an empty file:

ValueError: mmap length is greater than file size

Do I need to simply make an empty file filled with nulls in order to be able to use shared memory like this?

How can I use mmap for shared memory in Python between unrelated processes (not parent<->child communication) in a way which C++ can also play along? (not using multiprocessing.shared_memory)

JacKeown
  • If you want to have a shared memory backed by physical file, you have to have a file, and the file has to be of the adequate size. If you do not need physical file to back your memory, and instead just want to have a shared segment, you can use functions which create shared memory without files, either SysV `shmget` and friends, or Posix `shm_open` and their friends respectively. – SergeyA Feb 12 '21 at 18:32
  • @SergeyA, can you link me to a good resource for doing this in Python? Is there a way to do this that's cross-platform? – JacKeown Feb 12 '21 at 18:37

1 Answer


To answer the questions directly as best I can:

  • The file needs to be sized appropriately before it can be mapped. If you need more space later, there are different ways to get it, but the most portable is likely to unmap the file, resize it on disk, and then remap it. See: How to portably extend a file accessed using mmap()

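  • You might be able to mmap with MAP_ANONYMOUS|MAP_SHARED, then fork, then run with the same shared memory in both processes. Note that this only covers parent/child processes, since an anonymous mapping has no name that an unrelated process could open. See: Sharing memory between processes through the use of mmap()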

  • Alternatively, you could create a ramdisk, create a file there of a specific size, and then mmap into both processes.

  • Keep in mind that you'll need to deal with synchronization between the two processes - different platforms might have different approaches to this, but they traditionally involve using a semaphore of some kind (e.g. on Linux: https://man7.org/linux/man-pages/man7/sem_overview.7.html).
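As a rough illustration of the first point, here is a minimal sketch of sizing the file before mapping it. The helper name `open_shared` and the 4096-byte size are made up for the example, and the file lives in a temporary directory; on Linux you could instead put it under a tmpfs mount such as /dev/shm so that nothing is written to a real disk (that choice is an assumption about your platform, not something Python does for you).

```python
import mmap
import os
import tempfile


def open_shared(path, size):
    """Hypothetical helper: create the file if needed, size it, and map it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Only ever grow the file, so a process that attaches later
        # does not truncate data another process is already using.
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        # The default mapping is shared and read/write.
        return mmap.mmap(fd, size)
    finally:
        # mmap keeps its own duplicate of the descriptor, so ours can go.
        os.close(fd)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "shared_memory_file")
    writer = open_shared(path, 4096)
    reader = open_shared(path, 4096)  # stands in for a second process
    writer[:5] = b"hello"
    print(reader[:5])  # b'hello'
```

Sizing the file with `os.ftruncate` is what avoids the "mmap length is greater than file size" error; you do not have to write the null bytes yourself. A C or C++ process can attach to the same region by opening the same path and calling `mmap` with `MAP_SHARED`. The sketch does no synchronization, so the last point above still applies.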

All that being said, traditional shared memory will probably do better than mmap for this use-case. In general, OS-level IPC mechanisms are likely to do better out of the box than hand-rolled solutions - there's a lot of tuning that goes into something to make it perform well, and mmap isn't always an automatic win.

Good luck with the project!

quadrosis
  • I'm not sure what you mean by "traditional shared memory". Excuse my ignorance, but I'd greatly appreciate it if you could link me to a good resource for learning about this in both C/C++ and Python. – JacKeown Feb 14 '21 at 02:50
  • Basically, what multiprocessing.shared_memory does under the hood. It's possible to do the same thing without the calls to that code, though: see https://stackoverflow.com/questions/26114518/ipc-between-python-and-win32-on-windows-os for Windows and https://www.themetabytes.com/2018/05/26/python-inter-process-communication/ ("shared memory with no backing file") for Linux. – quadrosis Feb 15 '21 at 05:10
  • Also, if you're interested in really pushing data quickly, http://concurrencykit.org/ has some good reading material in its articles and slides (I think). I would also suggest https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html if you're in the mood for something more comprehensive. It's not an in-depth primer on shared memory, per se, but it is a good resource for understanding different approaches to both inter-thread and inter-process communication. Hope something in there is useful! – quadrosis Feb 15 '21 at 05:20