P0668R5 made some changes to the sequentially-consistent ordering. The following example, from the proposal (also from cppreference), describes the motivation for the modification.
// Thread 1:
x.store(1, std::memory_order_seq_cst); // A
y.store(1, std::memory_order_release); // B
// Thread 2:
r1 = y.fetch_add(1, std::memory_order_seq_cst); // C
r2 = y.load(std::memory_order_relaxed); // D
// Thread 3:
y.store(3, std::memory_order_seq_cst); // E
r3 = x.load(std::memory_order_seq_cst); // F
where the initial values of x and y are 0.
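For anyone who wants to experiment, here is a minimal sketch that wraps the three threads above in a harness. The names Outcome, run_litmus and is_disputed_outcome are my own and do not come from the proposal or cppreference. A single run will most likely not show the outcome in question. As far as I understand, the proposal ties it to weakly ordered hardware such as Power, so this is only a way to make the example concrete, not a reliable reproduction.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper type: the three values read in the example.
struct Outcome { int r1, r2, r3; };

// Runs the three-thread example once and returns what was observed.
Outcome run_litmus() {
    std::atomic<int> x{0}, y{0};
    Outcome o{};

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);             // A
        y.store(1, std::memory_order_release);             // B
    });
    std::thread t2([&] {
        o.r1 = y.fetch_add(1, std::memory_order_seq_cst);  // C
        o.r2 = y.load(std::memory_order_relaxed);          // D
    });
    std::thread t3([&] {
        y.store(3, std::memory_order_seq_cst);             // E
        o.r3 = x.load(std::memory_order_seq_cst);          // F
    });

    // join() synchronizes with thread completion, so reading o is safe here.
    t1.join();
    t2.join();
    t3.join();

    // Sanity check: x only ever holds 0 or 1.
    assert(o.r3 == 0 || o.r3 == 1);
    return o;
}

// True for the outcome discussed in the proposal: r1 == 1, r2 == 3, r3 == 0.
bool is_disputed_outcome(const Outcome& o) {
    return o.r1 == 1 && o.r2 == 3 && o.r3 == 0;
}
```

Calling run_litmus() in a loop and counting how often is_disputed_outcome returns true would be one way to look for the outcome. Seeing a count of zero on a particular machine says nothing about whether the standard allows it.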
According to the proposal, r1 is observed to be 1, r2 is 3, and r3 is 0. But this outcome is not allowed by the standard as it was worded before the change. The proposal explains:
The indicated outcome here is disallowed by the current standard: All memory_order_seq_cst accesses must occur in a single total order, which is constrained to have F before A (since it doesn't observe the store), which must be before C (since it happens before it), which must be before E (since the fetch_add does not observe the store, which is forced to be last in modification order by the load in Thread 2). But this is disallowed since the standard requires the happens before ordering to be consistent with the sequential consistency ordering, and E, the last element of the sc order, happens before F, the first one.
To solve this problem, C++20 changed the meaning of strongly happens-before (the old meaning was renamed to simply happens-before). Under the modified rule, although A happens before C, A does not strongly happen before C, so A does not need to precede C in the single total order.
I'm wondering about the result of the modification. According to cppreference, the single total order of memory_order_seq_cst is C-E-F-A (I don't know why). But according to the happens-before rule, A still happens-before C, so the side effects of A should be visible to C. Does this mean that A precedes C in the modification order seen by thread 2? If so, does this mean that the single total order seen by all threads is not consistent? Can someone explain the above example in detail?