
I am using the Annoy library, which uses mmap() to load multi-GB files into RAM. The point of using mmap() is that the file is loaded into memory only once, even when several processes need it.

Using Docker, I plan to scale by running multiple containers that execute the same script on the same host. The multi-GB file must still be loaded into RAM only once (which is why we use mmap()), otherwise my server will run out of memory.

The multi-Gb file is located in a volume mounted on my containers.

But I still need to find a way to share RAM between containers so that I get the advantages of mmap().

I found this article about using the --ipc flag in Docker, but I don't know whether it applies to my case or how to implement it. Any help is welcome.

jww
Robycool
  • Are you scaling programmatically, auto-launching containers? If you use the `--ipc` flag you can use the memory namespace from a "host" container (basically just choose one to be the master) in other containers. From your example article, the master would be the "producer" and all other containers "consumers" – trker May 27 '19 at 15:45
  • @trker Yes, I am scaling by adding replicas in a docker-compose.yml file, to which I will add the --ipc flag. Can you confirm that my understanding is correct (I am a newbie when it comes to RAM): (1) sharing the memory namespace = sharing RAM. (2) Although the path to my multi-GB file will differ across containers, Linux will automatically detect that it is the same file and hence load it into RAM only once. No additional configuration is needed for this in Docker or the Annoy library. – Robycool May 28 '19 at 07:53
  • 1) yes, [`IPC (POSIX/SysV IPC) namespace provides separation of named shared memory segments, semaphores and message queues.`](https://docs.docker.com/engine/reference/run/#ipc-settings---ipc) 2) If you are mounting the volume the same way in each container, I don't see why the path would be different, but yes with IPC you are accessing the same file in memory – trker May 28 '19 at 15:31

1 Answer


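--ipc is a red herring. For local volumes no action is required: if it is the same file, the memory will be shared, because all containers on a host run on the same kernel and file-backed mappings are served from its single page cache. I suspect the same holds for remote volumes, but I can't confirm that a remote volume will not be mounted multiple times.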
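
To make the "same file" condition concrete, here is a minimal sketch using only Python's standard library, not Annoy itself. The helper names `same_file` and `map_readonly` and the file names are hypothetical, chosen for illustration. It assumes a local Linux filesystem, where the kernel identifies a file by its device and inode rather than by the path used to open it, so two different paths to the same file map the same cached pages.

```python
import mmap
import os
import tempfile


def same_file(path_a, path_b):
    """Return True if both paths resolve to the same device and inode."""
    return os.path.samefile(path_a, path_b)


def map_readonly(path):
    """Map a whole file read-only.

    The mapping stays valid after the file object is closed, because
    the mmap object keeps its own duplicate of the file descriptor.
    """
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


if __name__ == "__main__":
    # Hypothetical stand-in for a large index file on a shared volume.
    workdir = tempfile.mkdtemp()
    original = os.path.join(workdir, "index.ann")
    with open(original, "wb") as f:
        f.write(b"\x01" * 8192)

    # A second path to the same file, standing in for a different
    # mount point inside another container.
    alias = os.path.join(workdir, "alias.ann")
    os.symlink(original, alias)

    first = map_readonly(original)
    second = map_readonly(alias)
    print("same file:", same_file(original, alias))
    print("same contents:", first[:] == second[:])
```

Running `os.stat()` on the mounted file inside two containers and comparing `st_dev` and `st_ino` is a quick way to check the same condition on a real deployment.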

Timothy Baldwin