Consider the code below:

static std::atomic<int> x;

// Thread 1
x.store(7, std::memory_order_relaxed);

// Thread 2
x.load(std::memory_order_relaxed);

Further assume that Thread 2 executed the load a few cycles after Thread 1 executed the store.

Is it guaranteed that Thread 2 would read the value 7 on all hardware platforms? In other words, after Thread 1 executes the store, is it guaranteed that the value 7 is immediately visible to threads that do a relaxed load of the same variable later?

cmutex

1 Answer

Yes, it will be visible. This has become a fairly common question about what relaxed memory order does compared with the stronger memory orders.

A memory order stronger than memory_order_relaxed does the following two things:

  • synchronizes non-atomic data
  • prevents reordering of instructions within the same thread

Different orders provide different synchronization guarantees (reads only, writes only, or both) and different reordering restrictions (only for what executes before the instruction, only for what executes after it, or both). memory_order_relaxed guarantees none of these; it only guarantees that atomic variables are visible across threads.
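To illustrate the "synchronizes non-atomic data" point, here is a minimal sketch of a release store paired with an acquire load. The names publish_and_consume, payload, and ready are hypothetical and not part of the question. With memory_order_relaxed on both sides, the read of payload would be a data race; with release/acquire it is well defined.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: publishes a plain int through an atomic flag and
// returns the value the consumer thread read.
int publish_and_consume() {
    int payload = 0;                 // non-atomic data
    std::atomic<bool> ready{false};  // atomic flag guarding the payload
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // plain write
        ready.store(true, std::memory_order_release);  // publishes the write above
    });

    std::thread consumer([&] {
        // Spin until the flag is observed; the acquire load that reads `true`
        // synchronizes-with the release store in the producer.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        assert(payload == 42);  // guaranteed by the release/acquire pairing
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling publish_and_consume() from main should return 42. The reordering restriction is what keeps the write to payload from moving after the store to ready.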

In your example there are no instructions other than the store in one thread and the load in the other. There is also no non-atomic memory to synchronize across threads; the only memory involved is already atomic.

So there is no need to use anything heavier than memory_order_relaxed. The snippet above is valid and will produce the desired result.
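For anyone who wants to run it, here is a minimal sketch of the question's snippet as a complete two-thread program fragment. The function name run_relaxed_example and the polling loop are my additions, not part of the question. The reader polls because C++ describes visibility in terms of what loads observe rather than wall-clock time: any single relaxed load may still return the old value, and the loop simply waits until the store has become visible.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

static std::atomic<int> x{0};

// Hypothetical helper: runs the relaxed store and the relaxed load on two
// threads and returns the first nonzero value the reader observed.
int run_relaxed_example() {
    x.store(0, std::memory_order_relaxed);  // reset so the helper can be reused
    int observed = 0;

    // Thread 2: may read 0 any number of times before the store is visible.
    std::thread reader([&] {
        while ((observed = x.load(std::memory_order_relaxed)) == 0) {
            std::this_thread::yield();
        }
    });

    // Thread 1: the relaxed store from the question.
    std::thread writer([] { x.store(7, std::memory_order_relaxed); });

    writer.join();
    reader.join();  // join makes the reader's write to `observed` visible here
    assert(observed == 7);
    return observed;
}
```

No stronger ordering is needed here because the only shared memory is x itself.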

David Haim
  • "Yes" is far too simplistic an answer, IMO. The OP is talking about actual time / cycles, not threads with a "synchronizes-with" / "happens before" relationship. They said *Thread 2 executed the load **a few cycles** after Thread 1 executed the store*. Cache is coherent so it will eventually become visible without any explicit flushing, but to "execute" a store in CPU architecture terms just means to put it in the private store buffer, to eventually commit (and become visible) some time after the execution is known to be non-speculative. This is key to OoO exec of stores. – Peter Cordes Feb 09 '21 at 21:59
  • See [Can a speculatively executed CPU branch contain opcodes that access RAM?](https://stackoverflow.com/q/64141366) for more detail about store buffers and execution. But anyway, the key here is that other cores can keep loading the old value right up until they receive an RFO (Read For Ownership) from the core trying to take ownership of the cache line so it can commit its store. (Store visibility is delayed by the store buffer. This is also how you get StoreLoad reordering if thread 1 also did some later loads, even on a strongly-ordered CPU that keeps everything else in order.) – Peter Cordes Feb 09 '21 at 22:03
  • I posted an answer on the linked duplicate ([memory\_order\_relaxed and visibility](https://stackoverflow.com/q/66054666)). – Peter Cordes Feb 09 '21 at 23:11
  • I disagree with your points in general. Not because they're incorrect, but because they're irrelevant. The only thing we care about is the C++ standard. The C++ standard dictates that a store is visible to a load if the store actually happens before the load and the store + load are atomic. This is the case here. There is nothing "too simplistic" about it, and if there is, you're welcome to propose a more detailed standard for C++. You DO NOT have to understand CPU machinery to write good/safe concurrency code in C++. All you need is in the standard already. – David Haim Feb 10 '21 at 08:54
  • Then the correct answer to what the OP is asking is "C++ doesn't have any notion of 'same time' other than what threads observe in memory". i.e. it's not a question C++ can answer. You're exactly right that you don't have to understand CPU machinery to write good/safe code in C++, and your answer should say so and tell the OP that they're asking the wrong question. Don't just answer a different one without mentioning it! – Peter Cordes Feb 10 '21 at 09:10
  • I think you're overfocusing on the term "a few cycles after"; you can just replace it with "after". To remind you, the OP accepted my answer. I understood exactly what he was asking. – David Haim Feb 10 '21 at 09:12
  • The OP is apparently thinking in CPU terms, and you're answering in C++ terms which assume the existence of a "happens-before" / "synchronizes-with" relationship between threads. The OP probably thought you were answering the question in terms of their understanding of "after", so them accepting your answer doesn't prove they meant the same question you answered. – Peter Cordes Feb 10 '21 at 09:16