C++: sharing atomic_int64_t between different processes?

Question

I am using multiple processes in C++ and passing data from one to another through shared memory. I put an array in the shared memory. Process A copies data into the array, and process B uses the data in the array. However, process B needs to know how many items are in the array.

Currently I am using a pipe/message queue to pass the array size from A to B. But I think I could put an atomic variable (like atomic_uint64_t) in the shared memory, modify it in process A, and load it in process B. However, I have the following questions.

Is an atomic variable in C++ still atomic between processes? I know atomics are implemented by locking the cache line, so neither another thread nor another process can modify the atomic variable. So I think the answer is yes.
How exactly should I share an atomic variable between processes? Can anyone give an example?

There are no processes in C++ nor shared memory. That's an extension to the language and implementation defined. But yeah, `std::atomic` will just work for shared memory. — Goswin von Brederlow, Jun 22 '22 at 10:53

Homer512 · Accepted Answer · 2022-06-22T11:40:53.883

Atomics will work, for the most part. Some caveats:

Only use lock-free atomics. See Are lock-free atomics address-free in practice?
Obviously pointers won't work, for example std::atomic<std::shared_ptr<T>>
std::atomic<T>::wait, and notify_one/all may not work

Technically, the behavior is non-portable (and lock-free does not guarantee address-free) but basic use including acquire-release should work on all mainstream platforms.

If you already have shared memory, the use should be pretty simple. Maybe something like this:

struct SharedBuffer
{
    std::atomic<std::size_t> filled;
    char buf[];
};

SharedBuffer* shared = static_cast<SharedBuffer*>(
      mmap(..., sizeof(SharedBuffer) + size, ...));

fill(shared->buf);
shared->filled.store(size, std::memory_order_release);
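
The snippet above only shows the producer. As a rough sketch of both sides, the following pairs the release store with an acquire load in a second process. It assumes Linux/POSIX (`fork`, `mmap` with `MAP_SHARED | MAP_ANONYMOUS`) and C++17. The fixed 64-byte `buf`, the `"hello"` payload, and the function name `run_release_acquire_demo` are made up for illustration, and the consumer simply polls because notification is a separate problem.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <sched.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fixed-size payload (an assumption of this sketch) instead of `char buf[]`,
// which is a compiler extension in C++.
struct SharedBuffer {
    std::atomic<std::size_t> filled;
    char buf[64];
};

// Returns true if the child (consumer) saw exactly what the parent published.
bool run_release_acquire_demo() {
    void* mem = mmap(nullptr, sizeof(SharedBuffer), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return false;

    // Start the object's lifetime inside the mapping, then mark it empty.
    auto* shared = new (mem) SharedBuffer;
    shared->filled.store(0, std::memory_order_relaxed);

    const char msg[] = "hello";
    assert(sizeof(msg) <= sizeof(shared->buf));

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(SharedBuffer));
        return false;
    }
    if (pid == 0) {
        // Process B: poll until the count is published. The acquire load
        // makes the bytes written before the release store visible.
        std::size_t n;
        while ((n = shared->filled.load(std::memory_order_acquire)) == 0) {
            sched_yield();
        }
        bool ok = n == sizeof(msg) && std::memcmp(shared->buf, msg, n) == 0;
        _exit(ok ? 0 : 1);
    }

    // Process A: write the data first, then publish its size.
    std::memcpy(shared->buf, msg, sizeof(msg));
    shared->filled.store(sizeof(msg), std::memory_order_release);

    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0);
    munmap(mem, sizeof(SharedBuffer));
    return reaped == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling `run_release_acquire_demo()` from `main` is enough to try it. With unrelated processes the mapping would come from something like `shm_open` instead of an anonymous mapping inherited over `fork`.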

Note that you still have to solve the issue of notifying the other process. To the best of my knowledge, you cannot use std::condition_variable and std::mutex. But the OS-specific types may work. For example, with pthreads you need to mark the mutex and the condition variable as process-shared by calling pthread_mutexattr_setpshared and pthread_condattr_setpshared with PTHREAD_PROCESS_SHARED.
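
A minimal sketch of the pthreads route, assuming Linux/POSIX and a mapping shared over `fork`. The `SharedState` layout, the value 42, and the function names are hypothetical, and error handling is kept to the bare minimum.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical layout for this sketch. `filled` is a plain integer because
// the mutex already orders every access to it.
struct SharedState {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    std::size_t filled;
};

// Initializes the mutex and condition variable as process-shared.
static bool init_shared_state(SharedState* s) {
    pthread_mutexattr_t ma;
    pthread_condattr_t ca;
    if (pthread_mutexattr_init(&ma) != 0) return false;
    if (pthread_condattr_init(&ca) != 0) {
        pthread_mutexattr_destroy(&ma);
        return false;
    }
    bool ok = pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED) == 0 &&
              pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED) == 0 &&
              pthread_mutex_init(&s->mutex, &ma) == 0 &&
              pthread_cond_init(&s->cond, &ca) == 0;
    pthread_mutexattr_destroy(&ma);
    pthread_condattr_destroy(&ca);
    s->filled = 0;
    return ok;
}

// Returns true if the child was woken up and saw the published count.
bool run_pshared_demo() {
    void* mem = mmap(nullptr, sizeof(SharedState), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return false;
    auto* s = new (mem) SharedState;
    if (!init_shared_state(s)) {
        munmap(mem, sizeof(SharedState));
        return false;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(SharedState));
        return false;
    }
    if (pid == 0) {
        // Process B: sleep until process A publishes a nonzero count.
        pthread_mutex_lock(&s->mutex);
        while (s->filled == 0) {
            pthread_cond_wait(&s->cond, &s->mutex);
        }
        assert(s->filled != 0);  // loop exit implies the predicate holds
        std::size_t n = s->filled;
        pthread_mutex_unlock(&s->mutex);
        _exit(n == 42 ? 0 : 1);
    }

    // Process A: publish under the lock, then wake the waiter.
    pthread_mutex_lock(&s->mutex);
    s->filled = 42;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->mutex);

    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0);
    pthread_cond_destroy(&s->cond);
    pthread_mutex_destroy(&s->mutex);
    munmap(mem, sizeof(SharedState));
    return reaped == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Depending on the toolchain, this may need `-pthread` when compiling and linking. Because the count is protected by the mutex here, the atomic is not needed in this variant; whether that trade is worthwhile depends on how often process B would otherwise have to poll.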

Maximizing portability

int64_t may be a bit risky if your CPU architecture is 32-bit and doesn't come with 64-bit atomics. You can check at runtime with atomic_is_lock_free or at compile time with is_always_lock_free (C++17).
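
Both checks might look roughly like this, assuming C++17. The function name `counter_is_lock_free` is only for illustration, and in practice one of the two checks is usually enough.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Compile-time check (C++17): the build fails on targets where a 64-bit
// atomic would need a hidden lock.
static_assert(std::atomic<std::uint64_t>::is_always_lock_free,
              "std::atomic<uint64_t> is not always lock-free on this target");

// Runtime check, for targets where the answer depends on the actual CPU.
bool counter_is_lock_free() {
    std::atomic<std::uint64_t> probe{0};
    bool lock_free = std::atomic_is_lock_free(&probe);
    // The free function and the member function report the same property.
    assert(lock_free == probe.is_lock_free());
    return lock_free;
}
```

Keep in mind the caveat above: a lock-free result is a strong hint that the atomic is usable in shared memory, not a formal guarantee.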

Similarly, size_t may be risky if you want to mix 32-bit and 64-bit binaries. Then again, when targeting mixed binaries, you have to limit yourself to less than a 32-bit address space anyway.

If you want to provide a fallback for missing lock-free atomics, atomic_flag is guaranteed to be lock-free. So you could roll your own spinlock. Personally, I wouldn't invest the time, however. You already use OS facilities to set up shared memory. It is reasonable to make some assumptions about the runtime platform.
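
For completeness, a hand-rolled spinlock on `atomic_flag` could be sketched as follows, again assuming Linux/POSIX and a mapping shared over `fork`. `SpinLock`, `LockedCounter`, `add_many`, and `run_spinlock_demo` are hypothetical names, and the lock does no backoff or yielding, so it only suits very short critical sections.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <new>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Minimal spinlock built on std::atomic_flag, the one atomic type the
// standard guarantees to be lock-free.
class SpinLock {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // spin until the current holder calls unlock()
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

struct LockedCounter {
    SpinLock lock;
    long value = 0;  // plain integer, protected by `lock`
};

// Increments the counter n times, taking the lock for each increment.
static void add_many(LockedCounter* c, long n) {
    assert(n >= 0);
    for (long i = 0; i < n; ++i) {
        std::lock_guard<SpinLock> guard(c->lock);
        ++c->value;
    }
}

// Parent and child each add `per_process`. Returns the total, or -1 on error.
long run_spinlock_demo(long per_process) {
    void* mem = mmap(nullptr, sizeof(LockedCounter), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    auto* c = new (mem) LockedCounter;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(LockedCounter));
        return -1;
    }
    if (pid == 0) {
        add_many(c, per_process);
        _exit(0);
    }
    add_many(c, per_process);

    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0);
    bool child_ok = reaped == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0;
    long total = child_ok ? c->value : -1;
    munmap(mem, sizeof(LockedCounter));
    return total;
}
```

One design caveat of any lock in shared memory: if a process dies while holding it, the other side spins forever, which is one more reason to prefer plain lock-free atomics where they are available.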

ISO C++ says lock-free atomics *should* be address-free (http://eel.is/c++draft/atomics.lockfree), so not guaranteed portable, but encouraged by the standard to work in shared memory between processes. ([Are lock-free atomics address-free in practice?](https://stackoverflow.com/q/51463312)) That note doesn't seem to have changed for C++20 wait/notify, but yeah that probably requires some care. It might or might not still work across processes. — Peter Cordes, Jun 22 '22 at 11:14
I'm not sure if Linux `futex` needs any extra care to use across processes, or how other OSes may differ, so prob. roll your own fallback from spinning to sleeping like before C++20. Or just spin (with a `pause` loop or something) if your use-case will almost always see a value promptly, not actually need to sleep. — Peter Cordes, Jun 22 '22 at 11:21
@PeterCordes futex has a flag `FUTEX_PRIVATE_FLAG` to make the operation process-local. I assume it is set with all standard operations for performance reasons. So I guess `atomic::wait` will not work cross-process unless the standard forces it to do so — Homer512, Jun 22 '22 at 11:25
@PeterCordes interestingly the man-page says process-local, not anonymous memory. So I could still mmap the same memory twice in the same process and use one futex through two different pointers. Unless the man-page is wrong. — Homer512, Jun 22 '22 at 11:27
Ah, yeah probably so, then. `strace` could show what any given compiler/library version uses now. (But even if some implementation doesn't use PRIVATE futex, another future version might.) — Peter Cordes, Jun 22 '22 at 11:28
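
As a rough illustration of the "spin first, then fall back to sleeping" idea from these comments, here is a portable sketch that avoids `atomic::wait` and futexes entirely. The function name `wait_nonzero`, the spin limit, and the sleep bounds are arbitrary choices for this example, and it uses a plain load loop rather than an x86 `pause` instruction.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Waits until `value` is nonzero and returns what it read. `spin_limit` and
// the sleep bounds are arbitrary tuning knobs chosen for this sketch.
std::uint64_t wait_nonzero(const std::atomic<std::uint64_t>& value,
                           int spin_limit = 1000) {
    assert(spin_limit >= 0);

    // Phase 1: spin briefly, for the case where the value arrives promptly.
    for (int i = 0; i < spin_limit; ++i) {
        std::uint64_t v = value.load(std::memory_order_acquire);
        if (v != 0) return v;
    }

    // Phase 2: poll with exponential backoff so a long wait does not burn
    // a whole core. The delay stops growing once it reaches max_delay.
    auto delay = std::chrono::microseconds(10);
    const auto max_delay = std::chrono::microseconds(1000);
    for (;;) {
        std::uint64_t v = value.load(std::memory_order_acquire);
        if (v != 0) return v;
        std::this_thread::sleep_for(delay);
        if (delay < max_delay) delay *= 2;
    }
}
```

The function only needs a reference to the atomic, so it should behave the same whether the object lives in ordinary memory or in a shared mapping, under the same lock-free assumption discussed above.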

score 1 · Answer 2 · answered Jun 23 '22 at 08:55

Thanks for the answers above. I wrote some simple code to verify whether we can simply use store(), load() and ++ in shared memory. The answer is YES. This is the code:

#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <atomic>
#include <cinttypes>
#include <cstdint>
#include <new>

// Map anonymous memory that stays shared with child processes after fork().
void* create_shared_memory(size_t size) {
    int protection = PROT_READ | PROT_WRITE;

    int visibility = MAP_SHARED | MAP_ANONYMOUS;

    return mmap(nullptr, size, protection, visibility, -1, 0);
}

struct SharedBuffer
{
    std::atomic<uint64_t> filled;
    int arr[1024];
};


int main() {
    void* shmem = create_shared_memory(sizeof(SharedBuffer));
    if (shmem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    // Construct the object in the mapping before using the atomic.
    auto data = new (shmem) SharedBuffer;
    data->filled.store(0);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }

    uint64_t countInThisProcess = 0;
    if (pid == 0) {
        // Child: increments only when the counter is even.
        while(data->filled.load() < 1024ULL * 1024) {
            if (data->filled.load() % 2 == 0) {
                data->filled++;
                countInThisProcess++;
            }
        }
        printf("++ in child process: %" PRIu64 "\n", countInThisProcess);
    } else {
        // Parent: increments only when the counter is odd.
        while(data->filled.load() < 1024ULL * 1024) {
            if (data->filled.load() % 2 == 1) {
                data->filled++;
                countInThisProcess++;
            }
        }
        printf("++ in parent process: %" PRIu64 "\n", countInThisProcess);
        waitpid(pid, nullptr, 0);  // reap the child before exiting
    }
    munmap(shmem, sizeof(SharedBuffer));
    return 0;
}
// Output
// ++ in parent process: 524288
// ++ in child process: 524288

I put an atomic<uint64_t> in the shared memory and added 1 to it from different processes. Each process incremented it 524288 times.

Be aware that this was only tested on 64-bit Ubuntu 20.04 with g++ 7.5.0. Other atomic functions such as wait/notify were not tested.
