3

I am trying to publish some random things over shared memory; and for some weird reason, the reader doesn't pick up what the sender has written

#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <sys/types.h>
#include <cstdio>

class SHM {
    volatile char* _ptr;
public:
    SHM() {
        const auto handle = shm_open("myTest", O_RDWR|O_CREAT, 0666);
        const auto size =  4 * 1024 * 1024;
        if (-1 == ftruncate(handle, size)) {
            throw;
        }
        _ptr = (volatile char*)mmap(0,size , PROT_READ | PROT_WRITE, MAP_SHARED, handle, 0);

        if(_ptr == MAP_FAILED){
            throw;
        }

                int rc = fchmod(handle, 0666);
                if (rc == -1) {
            throw;
                }
    }

    bool read(uint64_t& magic, uint64_t& time) {
        const uint64_t newVal = *(uint64_t*)_ptr;
        if (newVal != magic) {
            magic = newVal;
            printf("value changed!!!\n");
            time = *(uint64_t*)(_ptr + sizeof(magic));
            return true;
        }
        //printf("old value: %lu\n", newVal);
        return false;
    }

    void publish(const uint64_t time) {
        __sync_fetch_and_add((uint64_t*)_ptr, time);
        __sync_synchronize();
        *(uint64_t*)(_ptr + sizeof(uint64_t)) = time;
    }
};

Here is the sender:

#include <ctime>
#include <unistd.h>
#include <cstdlib>
#include <cstdint>
#include "shm.h"

int main() {
    SHM shm;
    timespec t;
    for (auto i = 0; i < 10000; i++) {
        if (0 == clock_gettime(CLOCK_REALTIME, &t)) {
            const uint64_t v = t.tv_sec * 1000 * 1000 * 1000 + t.tv_nsec;
            shm.publish(v);
            printf("published %lu\n", v);
            usleep(100);
        }
    }
}

Here is the reader:

#include <iostream>
#include "shm.h"

int main() {
    SHM shm;
    uint64_t magic = 0;
    uint64_t t = 0;
    while (true) {
        if (shm.read(magic, t)) {
            printf("%lu, %lu\n", magic, t);
        }
    }
}

If I restart the reader, the reader is indeed able to read the last value that the sender has written.

However, if I start the reader first, and then the sender, all the values the sender writes aren't picked up by the reader.

To make this even weirder, if I uncomment the printf statement in SHM::read(), then the reader is able to pick up sometimes.

Any idea?

GCC version:

g++ (GCC) 7.2.1 20170829 (Red Hat 7.2.1-1)
HCSF
  • 2,387
  • 1
  • 14
  • 40

1 Answers1

1

I spotted a couple of issues, however, I am unsure if they would fix your problem.

  1. name for shm_open should start with / for portable use.
  2. In read and publish the casts must not discard volatile. E.g.: const uint64_t newVal = *(uint64_t volatile*)_ptr;. Even better, drop volatile and use std::atomic.

Although there are different processes involved, this is still the case of same objects being accessed by more than one thread of execution and at least one of these threads modifies the shared objects.


I made the above changes. Using std::atomic fixed it:

class SHM {
    void* _ptr;
public:
    SHM() {
        const auto handle = shm_open("/myTest", O_RDWR|O_CREAT, 0666);
        const auto size =  4 * 1024 * 1024;
        if (-1 == ftruncate(handle, size))
            throw;

        _ptr = mmap(0,size , PROT_READ | PROT_WRITE, MAP_SHARED, handle, 0);

        if(_ptr == MAP_FAILED)
            throw;
    }

    bool read(uint64_t& magic, uint64_t& time) {
        auto p = static_cast<std::atomic<uint64_t>*>(_ptr);
        const uint64_t newVal = p[0];
        if (newVal != magic) {
            magic = newVal;
            printf("value changed!!!\n");
            time = p[1];
            return true;
        }
        return false;
    }

    void publish(const uint64_t time) {
        auto p = static_cast<std::atomic<uint64_t>*>(_ptr);
        p[0] += time;
        p[1] = time;
    }
};

void sender() {
    SHM shm;
    timespec t;
    for (auto i = 0; i < 10000; i++) {
        if (0 == clock_gettime(CLOCK_REALTIME, &t)) {
            const uint64_t v = t.tv_sec * 1000 * 1000 * 1000 + t.tv_nsec;
            shm.publish(v);
            printf("published %lu\n", v);
            usleep(100);
        }
    }
}

void reader() {
    SHM shm;
    uint64_t magic = 0;
    uint64_t t = 0;
    while (true) {
        if (shm.read(magic, t)) {
            printf("%lu, %lu\n", magic, t);
        }
    }
}

int main(int ac, char**) {
    if(ac > 1)
        reader();
    else
        sender();
}

With std::atomic you can have more control. E.g.:

struct Data {
    std::atomic<uint64_t> time;
    std::atomic<uint64_t> generation;
};

// ...

    bool read(uint64_t& generation, uint64_t& time) {
        auto data = static_cast<Data*>(_ptr);

        auto new_generation = data->generation.load(std::memory_order_acquire); // 1. Syncronizes with (2).
        if(generation == new_generation)
            return false;

        generation = new_generation;
        time = data->time.load(std::memory_order_relaxed);
        printf("value changed!!!\n");
        return true;
    }

    void publish(const uint64_t time) {
        auto data = static_cast<Data*>(_ptr);

        data->time.store(time, std::memory_order_relaxed);
        data->generation.fetch_add(time, std::memory_order_release);  // 2. (1) Synchronises with this store.
    }
Maxim Egorushkin
  • 131,725
  • 17
  • 180
  • 271
  • #2 is indeed the issue -- compiler probably optimized out `const uint64_t newVal = *(uint64_t*)_ptr`. I wonder up to which point I can stop using volatile qualifier because I thought declaring `volatile char* _ptr` is sufficient to tell the compiler not to optimize out the access to `_ptr` as `_ptr` has a qualifier volatile. Hope you can share some light. For #1, why starting / is portable? Thanks! – HCSF Jul 04 '18 at 16:14
  • @HCSF #1 See http://man7.org/linux/man-pages/man3/shm_open.3.html, second paragraph. – Maxim Egorushkin Jul 04 '18 at 16:18
  • 1
    @HCSF #1 http://pubs.opengroup.org/onlinepubs/9699919799/functions/shm_open.html: _If name begins with the character, then processes calling shm_open() with the same value of name refer to the same shared memory object, as long as that name has not been removed._ – Maxim Egorushkin Jul 04 '18 at 16:21
  • @HCSF #2 If you cast to `(uint64_t volatile*)` consistently everywhere it accesses `_ptr` it also works as expected. But `std::atomic` is the right way to do it. – Maxim Egorushkin Jul 04 '18 at 16:23
  • thanks for providing an example. What if _ptr is pointed a C struct (instead of uint64_t)? `auto p = static_cast*>(_ptr); auto v = p->load();` and then access the fields v? The load() likely generates an implicit lock if S is large right? If I stick with `auto p = static_cast(_ptr);`, it seems like an implicit lock won't be generated, right? Just trying to find a solution that has a lower latency impact. If yes, do I need to declare all fields in S volatile as well? As it seems like compiler doesn't inherently respect `volatile` when I access _ptr. thx – HCSF Jul 04 '18 at 16:49
  • @HCSF I advise you against using `volatile` for multi-threading/processing ever. http://isvolatileusefulwiththreads.com/ – Maxim Egorushkin Jul 04 '18 at 17:35
  • @HCSF Lock a process-shared mutex or spin-lock, read/write the structure, release. No need for volatile or `atomic`. – Maxim Egorushkin Jul 04 '18 at 17:44
  • Right. I agree that volatile should not be used for synchronizing between threads as it won't work (unlike volatile in Java). Tho, in my case, I was trying to use volatile for communicating between 2 processes over shared memory. I thought that volatile is [the recommended way](https://stackoverflow.com/questions/8819095/concurrency-atomic-and-volatile-in-c11-memory-model) – HCSF Jul 05 '18 at 01:33
  • @HCSF It is still the case of sharing memory between two different threads. Doesn't make difference here whether threads are from the same process or not. – Maxim Egorushkin Jul 05 '18 at 09:10
  • So after C++11, std::atomic should be used instead of volatile for dealing with mapped shared memory? – HCSF Jul 05 '18 at 12:14