
Both the C11 and C++11 standards define concurrent non-atomic reading and writing of the same memory location as a data race, which is undefined behavior, so such a program may do virtually anything. OK, got it. I want to understand the reasoning behind this (to me, today) overly strict requirement.
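
For concreteness, the kind of access the standards classify as a data race can be as minimal as the sketch below: two threads touching the same plain object without synchronization, at least one of them writing (the names payload and copy are just illustrative):

#include <thread>

int payload = 0;  // plain, non-atomic object shared between threads

int main() {
    std::thread t([] { payload = 42; });  // non-atomic write ...
    int copy = payload;                   // ... concurrent with a non-atomic read: a data race, hence UB
    t.join();
    (void)copy;
}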

I'm playing with yet another IPC mechanism that exploits exactly this kind of bad, racy memory access. I won't bother you with all the gory details; let me show a simplified version instead:

#include <atomic>
#include <type_traits>

template <typename T>
class sval {

    static_assert(std::is_trivially_copy_assignable<T>::value, "");
    static_assert(std::is_trivially_destructible<T>::value, "");

    static constexpr unsigned align = 64;

    // Writer-private serial number; 'serial' is the published copy readers see.
    unsigned last;
    alignas(align) std::atomic<unsigned> serial;

    // Double buffer: buf[serial & 1] holds the sealed value, the other slot may be mid-update.
    struct alignas(align) {
        T value;
    } buf[2];

public:

    sval(): last(0), serial(last) {}

    sval(sval &) = delete;
    void operator =(sval &) = delete;

    void write(const T &a) {
        ++last;
        buf[last & 1].value = a;                        // fill the slot readers should not be using yet
        serial.store(last, std::memory_order_release);  // publish it
    }

    class reader {

        const sval &sv;
        unsigned last;  // serial of the last value this reader has consumed

    public:

        reader(const sval &sv): sv(sv), last(sv.serial.load(std::memory_order_relaxed)) {}

        bool read(T &a) {
            unsigned serial = sv.serial.load(std::memory_order_acquire);

            if (serial == last) {
                return false;  // nothing new since the previous successful read
            }

            for (;;) {
                a = sv.buf[serial & 1].value;  // non-atomic copy, may race with write()
                unsigned check = sv.serial.load(std::memory_order_seq_cst);

                if (check == serial) {
                    last = check;  // serial did not move, so the copy is taken as consistent
                    return true;
                }

                serial = check;  // the writer interfered, retry with the fresh serial
            }
        }

    };

};

It is a shared value with a single writer and multiple readers. Underneath it contains two buffers: buf[serial & 1] is the 'sealed' one, while the other one may be in the middle of an update. The writer's logic is very simple and (that's the main feature) is not affected by the readers' presence or activity. But a reader has to do more to guarantee the consistency of the fetched data (and this is my main question here):

  1. read the serial number
  2. read the buf[serial & 1]
  3. read the serial again and retry if it was changed

So a reader may fetch garbage data in the middle, but it checks the serial afterwards. Is it still possible for something bad to leak out of the read() internals? If so, what are the exact reasons, either on the hardware side or elsewhere?
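
One way I can imagine expressing step 2 without a formal data race is to make the payload itself atomic and copy it with a relaxed load, while the serial handling stays the same. Below is a rough sketch of that idea; it assumes T is trivially copyable and std::atomic<T> is lock-free on the target, and the names sval_atomic, wlast and rlast are illustrative, not part of the class above.

#include <atomic>

// Sketch only: same double-buffer retry protocol, but the payload is std::atomic<T>,
// so the concurrent copy in step 2 is an atomic (relaxed) load rather than a racy plain read.
template <typename T>
struct sval_atomic {
    std::atomic<unsigned> serial{0};
    std::atomic<T> buf[2]{};

    void write(unsigned &wlast, const T &v) {       // wlast: writer-private serial, starts at 0
        ++wlast;
        buf[wlast & 1].store(v, std::memory_order_relaxed);
        serial.store(wlast, std::memory_order_release);
    }

    bool read(unsigned &rlast, T &a) const {        // rlast: reader-private serial, starts at 0
        unsigned s = serial.load(std::memory_order_acquire);
        if (s == rlast)
            return false;
        for (;;) {
            a = buf[s & 1].load(std::memory_order_relaxed);  // atomic, so no UB even if it races
            unsigned check = serial.load(std::memory_order_seq_cst);
            if (check == s) {
                rlast = check;
                return true;
            }
            s = check;
        }
    }
};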

Below is an example application. I tested both this app and my original, more complex idea on x86 and ARM, but neither the variety of tests nor the platform coverage makes me very confident about my ideas.

#include <algorithm>
#include <cassert>
#include <cstdio>
#include <thread>

int main() {
    const int N = 10000000;
    sval<int> sv;
    std::thread threads[2];

    for (auto &t : threads) {
        t = std::thread([&sv] {
                sval<int>::reader r(sv);
                int n = 0;  // number of values this reader actually observed

                for (int i = 0, a; i < N; i = a) {
                    while (!r.read(a)) {
                    }

                    assert(a > i);  // values must come out strictly increasing

                    ++n;
                }

                std::printf("%d\n", n);
            });
    }

    int dummy[24] = {};  // zero-initialized so the rotate below does not touch indeterminate values

    for (int i = 1; i <= N; ++i) {
        std::rotate(std::begin(dummy), dummy + 11, std::end(dummy));  // busy-work to vary the writer's timing
        sv.write(i);
    }

    for (auto &t : threads) {
        t.join();
    }
}
    According to C and C++, yes bad things may still happen. But the language doesn't prohibit the toolchain from providing a stronger guarantee, leveraging the behavior of the processor architecture memory model. (You don't get a stronger guarantee by default, optimizers might not preserve guarantees made by the hardware.) – Ben Voigt Aug 30 '14 at 22:29
  • Yes, undefined behavior can lead to reasoning backwards in time: http://stackoverflow.com/questions/23153445/can-branches-with-undefined-behavior-be-assumed-unreachable-and-optimized-as-dea. A program can behave erratically even before the UB is executed. – usr Aug 30 '14 at 23:36
  • I know all these things, as I said at the beginning. Yes, exploiting an UB voids the warranty. My question is about reasons behind the standards. If my algorithm is wrong (heck, vague word, let's try to apply the common sense), I'd like to understand the reason. Otherwise I'd like to find a way to express it in C or C++ without an UB. – Constantin Baranov Aug 31 '14 at 00:34
  • That's not the question you ask in the title (That has "yes" as an answer). You should probably clarify the title to make clear that you are asking for realistic circumstances under which this can break and how to fix it. – usr Aug 31 '14 at 10:17
  • Yes, hoisting loads out of loops leads to infinite loops. [MCU programming - C++ O2 optimization breaks while loop](//electronics.stackexchange.com/a/387478) – Peter Cordes Nov 05 '19 at 12:59

0 Answers