Consider the code below:
std::atomic<int> a{100};
---
CPU 0:
a.store(101, std::memory_order_relaxed);
---
CPU 1:
int tmp = a.load(std::memory_order_relaxed); // Assume `tmp` is 101.
Let's assume that CPU 0's store to `a` happens earlier in time than CPU 1's load of `a` (whether the load is reordered or not). Thus, in this scenario, `tmp` will be 101 instead of 100.
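
For anyone who wants to run the scenario, here is a minimal sketch using two threads. The helper name `observe_once` is mine, and the threads are not pinned to particular CPUs, so this only approximates the "CPU 0" / "CPU 1" setup. With relaxed ordering and no other synchronization, the reader may legitimately see either 100 or 101; the case I am asking about is the run where it sees 101.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the writer/reader pair once and returns
// the value the reader thread observed.
int observe_once() {
    std::atomic<int> a{100};
    int tmp = 0;

    // Plays the role of CPU 0.
    std::thread writer([&] { a.store(101, std::memory_order_relaxed); });
    // Plays the role of CPU 1.
    std::thread reader([&] { tmp = a.load(std::memory_order_relaxed); });

    writer.join();
    reader.join();  // join() makes the reader's write to tmp visible here

    // Only two outcomes are possible: the initial value or the stored one.
    assert(tmp == 100 || tmp == 101);
    return tmp;
}
```

Calling `observe_once()` from `main` in a loop will typically produce a mix of both values, depending on how the threads are scheduled.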
If the MOESI coherence protocol is used, then when CPU 0 stores to `a`, CPU 0 acquires the cache line in modified (M) mode. The store goes to CPU 0's store buffer. If CPU 1 had the cache line in its own cache, then its copy of the cache line transitions to invalid (I) mode.
When CPU 1 loads `a`, the cache line transitions to shared (S) mode (or maybe owned (O) mode).
Assume that the store to `a` is still in CPU 0's store buffer when CPU 1 loads `a`. Given that CPU 1 cannot read CPU 0's store buffer, when CPU 1 reads the cache line containing `a`, does this imply that CPU 0's store buffer is flushed (or at least that the entry for the cache line containing `a` is flushed from CPU 0's store buffer)?
If the flush did not happen, then this implies that CPU 0 and CPU 1 both have the cache line in shared (S) mode, but CPU 0 sees `a` with the value 101 and CPU 1 sees `a` with the value 100.
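
To make the state I am worried about concrete, here is a toy model of it. This is only a sketch under my own assumptions: `ToyCpu`, `Snapshot`, and `hypothetical_no_flush` are made-up names, the store buffer holds a single entry, and a CPU's own loads are assumed to check its store buffer first (store-to-load forwarding). It does not claim to describe how real hardware behaves; it just encodes the "no flush" outcome described above.

```cpp
#include <cassert>
#include <optional>

// Toy model only: one cache line holding `a`, one store-buffer entry per CPU.
enum class LineState { Modified, Owned, Exclusive, Shared, Invalid };

struct ToyCpu {
    LineState state = LineState::Invalid;
    int cached_value = 0;
    std::optional<int> store_buffer;  // pending store to `a`, not yet in the cache

    // Assumption: a CPU's own loads see its pending store before the cache.
    int load() const { return store_buffer ? *store_buffer : cached_value; }
};

struct Snapshot {
    int cpu0_sees;
    int cpu1_sees;
};

// Encodes the hypothetical outcome: both copies of the line are Shared,
// and CPU 0's store of 101 is still sitting in its store buffer.
Snapshot hypothetical_no_flush() {
    ToyCpu cpu0{LineState::Shared, 100, std::nullopt};
    ToyCpu cpu1{LineState::Shared, 100, std::nullopt};

    cpu0.store_buffer = 101;  // store issued but not committed to the cache

    assert(cpu0.state == LineState::Shared && cpu1.state == LineState::Shared);
    return {cpu0.load(), cpu1.load()};
}
```

In this model `hypothetical_no_flush()` returns 101 for CPU 0 and 100 for CPU 1 while both lines are in S mode, which is exactly the disagreement I am asking about.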
Note: I am asking about MOESI specifically, even though each microarchitecture implements its own coherence protocol. I would imagine that most microarchitectures handle this concern similarly, though.