
Let's say I am on CentOS 7 x86_64 + GCC 7.

I would like to create a ringbuffer in shared memory.

I have two processes, Producer and Consumer, and both share a named shared memory region that is created/accessed through shm_open() + mmap().

If Producer writes something like:

struct Data {
    uint64_t length;
    char     data[100];
};

to the shared memory at a random time, and the Consumer is constantly polling the shared memory to read, will I have some sort of synchronization issue where the member length is seen but the member data is still in the process of being written? If yes, what's the most efficient technique to avoid the issue?
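Roughly, the setup I have in mind looks like this (the name "/ring_demo" is just a placeholder, struct Data is the one above, and all error handling is omitted):

#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Both processes map the same named shared memory object.
Data* map_shared() {
    int fd = shm_open("/ring_demo", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(Data));
    return static_cast<Data*>(
        mmap(nullptr, sizeof(Data), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
}

// Producer: fill the payload, then set length.
void producer_write(Data* d) {
    std::memcpy(d->data, "hello", 6);
    d->length = 6;   // can the Consumer see this before data is fully written?
}

// Consumer: poll until length is non-zero, then read the payload.
void consumer_poll(Data* d, char* out) {
    while (d->length == 0) {}               // plain (non-atomic) loads/stores:
    std::memcpy(out, d->data, d->length);   // the compiler and CPU may reorder them
}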

I see this post: Shared-memory IPC synchronization (lock-free)

But I would like to get a deeper, lower-level understanding of what's required to synchronize between two processes efficiently.

Thanks in advance!

Hei

2 Answers


To avoid this, you would want to make the length member std::atomic and access it with acquire-release memory ordering. On most modern processors, the instructions this inserts act as memory fences: the release store guarantees that all of the writer's earlier stores (the payload in data) become visible before the new length does, and the acquire load guarantees that the reader's subsequent loads are not reordered before it observes that length.
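A minimal sketch of that pattern, assuming the struct lives in the shm_open()/mmap() mapping from the question and that the mapping's zero fill serves as initialization (the names publish/poll are illustrative):

#include <atomic>
#include <cstdint>
#include <cstring>

struct Data {
    std::atomic<uint64_t> length;   // 0 means "no message yet"
    char                  data[100];
};

// Producer: fill the payload first, then publish it with a release store,
// so everything written before the store is visible once length is seen.
void publish(Data* d, const char* src, uint64_t n) {
    std::memcpy(d->data, src, n);
    d->length.store(n, std::memory_order_release);
}

// Consumer: poll with an acquire load; once a non-zero length is observed,
// the payload written before the matching release store is guaranteed visible.
bool poll(Data* d, char* dst) {
    uint64_t n = d->length.load(std::memory_order_acquire);
    if (n == 0)
        return false;
    std::memcpy(dst, d->data, n);
    return true;
}

On x86_64 with GCC 7, std::atomic<uint64_t> is lock-free (and therefore address-free, per the quote below), so this works across the two processes.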

There are, in addition, locking primitives in POSIX, but the <atomic> header is newer and what you probably want.

What the Standard Says

From [atomics.lockfree], emphasis added:

Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes.

For lockable atomics, the standard says in [thread.rec.lockable.general], emphasis added:

An execution agent is an entity such as a thread that may perform work in parallel with other execution agents. [...] Implementations or users may introduce other kinds of agents such as processes [....]

You will sometimes see the claim that the standard supposedly makes no mention of using the <atomic> primitives with memory shared between processes, only threads. This is incorrect.

However, passing pointers to the other process through shared memory will not work, as the shared memory may be mapped to different parts of each process's address space, and of course a pointer to any object not in shared memory is right out. Indices and offsets of objects within shared memory will work. (Or, if you really need pointers, Boost.Interprocess provides IPC-safe wrappers.)
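For instance, a small sketch of the offset idea (names are illustrative):

#include <cstdint>

// Store an offset from the start of the mapping instead of a raw pointer;
// each process adds its own base address when it needs a pointer back.
inline std::uint64_t to_offset(const void* base, const void* obj) {
    return static_cast<const char*>(obj) - static_cast<const char*>(base);
}

inline void* from_offset(void* base, std::uint64_t offset) {
    return static_cast<char*>(base) + offset;
}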

Davislor
  • Right, I think std::atomic is very useful for sharing data among threads. But this time, I would like to share among processes. – Hei Feb 15 '18 at 09:59
  • Added what the standard says about using `<atomic>` with memory shared between processes. – Davislor Feb 15 '18 at 10:14
  • thanks for quoting the standard. I think I can use atomic for the member length. But how about the member data? Thanks! – Hei Feb 15 '18 at 10:29
  • @Hei please don't comment to say "thank you", upvote and/or accept the answer. – YSC Feb 15 '18 at 10:30
  • @YSC I will once I have a satisfied answer. Thanks! – Hei Feb 15 '18 at 10:32
  • @Hei Dude! Davislor's answer is really good. I understand you want to wait a few minutes before accepting an answer, but at least upvote it. They spent time and effort to help you with something valuable. It costs you _nothing_ to upvote. By not doing it, you shoot yourself in the foot. – YSC Feb 15 '18 at 10:34
  • Done. Didn't know it is urgent to upvote. Will keep that in mind in the future. – Hei Feb 15 '18 at 10:37

Yes, you will ultimately run into data races: not only can length be seen before data has been written, but parts of those members themselves can be observed half-written by the reading process.

Although lock-free is the new trend, I'd suggest going for a simpler tool for your first IPC synchronization job: the semaphore. On Linux, the sem_overview(7) and shm_overview(7) man pages will be useful.

The idea is to have each process signal the other that it is currently reading or writing the shared memory segment. With a semaphore, you can build an inter-process mutex:

Producer:
while true:
    (opt) create resource
    lock semaphore (sem_wait)
    copy resource to shm
    unlock semaphore (sem_post)

Consumer:
while true:
    lock semaphore (sem_wait)
    copy resource to local memory
        or crunch resource
    unlock semaphore (sem_post)

If, for instance, Producer is writing into shm while Consumer calls sem_wait, Consumer will block until Producer calls sem_post, but you have no guarantee that Producer won't go for another loop and write twice in a row before Consumer wakes up. You would have to build a mechanism to ensure that Producer and Consumer take turns.
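A rough sketch of that mutex usage with a POSIX named semaphore ("/ring_mutex" is a made-up name, error handling is omitted; on CentOS 7 you would typically link with -pthread and -lrt):

#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <semaphore.h>

// Both processes open the same named semaphore with an initial count of 1,
// which makes it behave like an inter-process mutex.
void producer_once(char* shm, const char* msg) {
    sem_t* sem = sem_open("/ring_mutex", O_CREAT, 0600, 1);
    sem_wait(sem);                   // lock
    std::strcpy(shm, msg);           // copy resource to shm
    sem_post(sem);                   // unlock
    sem_close(sem);
}

void consumer_once(const char* shm, char* out, std::size_t max) {
    sem_t* sem = sem_open("/ring_mutex", O_CREAT, 0600, 1);
    sem_wait(sem);                   // lock
    std::strncpy(out, shm, max);     // copy resource to local memory
    sem_post(sem);                   // unlock
    sem_close(sem);
}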

YSC
  • Thanks for your quick response. The links are useful, reading now. It seems like some sort of memory barrier is involved? It needs to somehow invalidate the Consumer's cache, forcing a cache miss or the like, so that length and data will be fetched from the shared memory. That's the "deeper, more low level" understanding I am trying to get to. Thanks! – Hei Feb 15 '18 at 09:58