Can a compiler invent multiple loads when compiling one std::atomic load using memory_order_relaxed?

Question

Taking the following piece of code

struct SharedMemory
{
  std::atomic<float> value;

  bool shouldReturnTrue()
  {
    float tmp = value.load(std::memory_order_relaxed);
    float a = tmp;
    float b = tmp;
    return a == b;
  }
};

Is it legal for the compiler to optimize it as follow and so break the thread safety ?

struct SharedMemory
{
  std::atomic<float> value;

  bool shouldReturnTrue()
  {
    float a = value.load(std::memory_order_relaxed);
    float b = value.load(std::memory_order_relaxed);
    return a == b;
  }
};

If so, should I force it to not optimize tmp using volatile keyword as follow ?

struct SharedMemory
{
  std::atomic<float> value;

  bool shouldReturnTrue()
  {
    volatile float tmp = value.load(std::memory_order_relaxed);
    float a = tmp;
    float b = tmp;
    return a == b;
  }
};

`std::memory_order` is used to 'constraint' **between** atomic operation. here you have only one atomic operation. — Jarod42, Mar 16 '22 at 11:34
Do you mean that even without specifying `std::memory_order_relaxed` I should use `volatile` anyway ? — Jojolastiti, Mar 16 '22 at 11:50
You should probably never use `volatile` (unless you use special memory). — Jarod42, Mar 16 '22 at 12:17
You need to read about atomics in something like a textbook. Guessing your way through atomics is pretty much impossible. — Passer By, Mar 16 '22 at 13:16
Of course it's illegal. All the weirdness of `memory_order_relaxed` only affects what number `load` can return (in this case). It doesn't allow the compiler to remove `tmp` completely. — HolyBlackCat, Mar 16 '22 at 18:44

Peter Cordes · Answer 1 · 2022-03-16T18:38:53.600

No, that wouldn't be a legal optimization. As you say, it could return false for cases other than tmp being NaN, if there was a writer. Therefore it wouldn't be valid per the as-if rule, because the C++ abstract machine does always return !isnan(tmp).

(If you'd picked a type like int where self == self was always true, it would be easier to talk about!)

The transformation you ask about is only legal for non-atomic variables, precisely because the value can be assumed not to be changing. If it is, that would be data-race undefined behaviour, and compilers are allowed to assume no UB. Thus they can invent another non-atomic non-volatile load if that's more convenient for register allocation.

But an atomic variable does have to be assumed to change value between any two reads (in this way similar to volatile in terms of what compilers have to assume about it), because that is well-defined behaviour that compilers have to respect.

So the compiler will never invent reads of atomic<> objects. (That linked article is about Linux kernel code, where they use volatile for hand-rolled atomics on compilers that are known to work the way they want, for the same reasons we use std::atomic<> in modern C++.)

memory_order_relaxed means that the ordering wrt. surrounding memory accesses is relaxed, not the atomicity guarantees. In this respect, relaxed isn't different from acquire or seq_cst. I guess you had some misconception that relaxed might be relevant here, but it isn't.

You're right that assigning the load result to volatile float tmp would make it less likely for a compiler to want to invent loads, but if that was a real problem (it isn't), a better way would be to use volatile atomic<float> value;. Again, this is not necessary. Especially now, since compilers currently don't optimize atomics at all (Why don't compilers merge redundant std::atomic writes?) but even in future with compilers that only give you the bare minimum ISO C++ standard and do optimize atomics by e.g. coalescing redundant back-to-back reads into one, they won't be allowed to do the other thing and invent extra writes, or invent extra reads and assume the value is the same.

Can a compiler invent multiple loads when compiling one std::atomic load using memory_order_relaxed?

1 Answers1