This answer uses several levels of nested markdown lists, which may be hard to read on mobile devices.
I recently read the same cppreference section and had the same question as you at first; I got the idea after reading the related SO QAs and papers. I hope this answer helps you understand what cppreference says.
Your question seems to be a duplicate of QA_1 and QA_2, but neither QA seems to have read the original paper_1, which is referenced by cppreference_1 and paper_2. So here I give an answer based mainly on paper_1 and partly on its reference paper_3, from a mathematical perspective, which may help you grasp the underlying ideas if you have some basic mathematical background.
This is based on my understanding of the papers and related links. Please point out any errors; thanks in advance.
Short answer:
- See the quote in the QA_1 question for the situation in which the above conflict occurs under the old C++11 standard.
And the QA_1 answer explains how this can occur in the real world (possibly because of cache consistency, which can cause different threads to see writes to different variables in different orders).
- More specifically, see Figure 3 and its context, the (S1fix) context, and the related terminology definitions in paper_1.
- "A happens-before C" corresponds to `S(k, m)`.
- A is sequenced before (sb) B (because they are in the same thread; see cppreference_2), and B synchronizes with (sw) C on the variable `y` but not on `x`, so A happens before C. (This is the definition of "happens before": a sequence composed of sb and sw edges.)
- The single total order "C-E-F-A" corresponds to `S(m, o, p, k)` (here I chained the individual `S(,)` pairs in order).
- Here `S(p, k)` comes from "reads-before" (IMO this means "reads before the write": p reads the value of `x` that k later overwrites. It deals with the WAR hazard).
- The above forms a cycle, which is why paper_1 has the "What Went Wrong and How to Fix it" discussion and redefines the condition as (S1fix). The C++20 standard is now based on it, as paper_2 says.
- You can think of it as splitting the original cycle (which the old C++11 single total order would have to contain) into two parts, `S(k, m)` and `S(m, o, p, k)`, where the former remains only a "happens-before" relation and the latter is the new single total order.
- To be more specific, the original example can have the observation order `(A)-B-C-E-D-F-A` (here "(A)" means that A runs earlier but is observed later).
- Here A does not synchronize with C, so nothing requires C to observe A.
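For reference, the example under discussion can be written as a small program. This is only a sketch: the labels A to F follow the cppreference example and k to p follow Figure 3 of paper_1, while the names `Outcome`, `run_once`, and `is_weak_outcome` are hypothetical helpers I made up. A single run will rarely (and on many machines never) show the outcome `r1 == 1 && r2 == 3 && r3 == 0`; the point is only to make the six operations concrete.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values observed by threads 2 and 3 in one run.
struct Outcome { int r1, r2, r3; };

// One run of the three-thread example (A..F as in cppreference, k..p as in Figure 3).
Outcome run_once() {
    std::atomic<int> x{0}, y{0};
    Outcome out{-1, -1, -1};

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);               // A (k)
        y.store(1, std::memory_order_release);               // B (l)
    });
    std::thread t2([&] {
        out.r1 = y.fetch_add(1, std::memory_order_seq_cst);  // C (m)
        out.r2 = y.load(std::memory_order_relaxed);          // D (n)
    });
    std::thread t3([&] {
        y.store(3, std::memory_order_seq_cst);               // E (o)
        out.r3 = x.load(std::memory_order_seq_cst);          // F (p)
    });

    t1.join();
    t2.join();
    t3.join();
    assert(out.r1 != -1 && out.r2 != -1 && out.r3 != -1);    // every thread wrote its result
    return out;
}

// The outcome discussed above: C reads B's store, D reads E's store, F misses A.
bool is_weak_outcome(const Outcome& o) {
    return o.r1 == 1 && o.r2 == 3 && o.r3 == 0;
}
```

Calling `run_once()` in a loop and counting how often `is_weak_outcome` holds is the usual way to hunt for such an outcome, but whether it ever shows up depends on the hardware and the compiler.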
Notice: the revised memory model introduced in C++20 may still be flawed. However, I am not a compiler or computer architecture expert, so finding such flaws is beyond my abilities.
TL;DR (come back here to look up anything you don't understand after reading paper_1). Detailed answer, mainly based on paper_1:
The following needs some knowledge of discrete mathematics, so I add short descriptions for readers who are not yet familiar with it.
If you don't want to get stuck on the "math", reading only the "non-math" parts below is enough.
Notice: only some symbols are rephrased here; if you have questions about any terminology, it is better to consult the original papers.
math
Some of the primitive symbols used below (mainly about relations) are defined in "Notation 1" of paper_1 and in "Definition 8" of paper_3 (for example `;`), which paper_1 references.
Here I assume that both papers use the same primitive mathematical symbols, because paper_1 "Remark 1" says:
The reason we use Batty et al.’s version here is that it provides a cleaner starting point for our discussion, and our solution to the problems with C11’s SC semantics will build on it.
After reading the footnotes on p. 5 of paper_3, it appears to be based mainly on the ISO standard and to map it to pure math, which may be more intuitive if you have a stronger mathematical background.
(x
meaning.)
Here `;` is composition of relations (i.e. `[A]; R; [B]` can be read as a pipeline `[A] -> R -> [B]`).
It contains exactly the pairs (a, b), with a in A and b in B, that are related by R.
happens-before definition:
non-math
See this (clearer than the cppreference ones because of its 2 levels), which both of the above QAs reference.
A is dependency-ordered before B, or
This implies that consume is included.
As the paper_1 says:
Besides the new SC and NO-THIN-AIR conditions, RC11 differs in a few other ways from C11.
It does not support consume accesses
So the above cppreference definition may be more general than the following math definitions from the paper_1.
math

$hb \triangleq (sb \cup sw)^{+}$

Next, we say that one event happens before (hb) another event if they are connected by a sequence of sb or sw edges

`+` means transitive closure (i.e. a sequence of one or more edges) and `∪` means "or" (union).
`S(k,m)` above is backed by the "happens-before" relation because of the synchronization between `l` and `m` (`m` reads the value written by `l`), together with `k` being sequenced before `l`.
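The chain "sb, then sw, then sb" is easiest to see in the usual message-passing idiom. The following is a minimal sketch under my own naming (`message_passing`, `data`, `ready`, and `seen` are hypothetical names, not taken from the papers): the plain write to `data` is sequenced before the release store, the release store synchronizes with the acquire load that reads it, and that load is sequenced before the plain read, so the write happens before the read.

```cpp
#include <atomic>
#include <thread>

// Returns the value the consumer thread sees in `data`.
int message_passing() {
    int data = 0;                        // plain (non-atomic) variable
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        data = 42;                                      // sb-before the store below
        ready.store(true, std::memory_order_release);   // release store
    });
    std::thread consumer([&] {
        // Acquire load: once it reads `true`, the release store
        // synchronizes with (sw) this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = data;                                    // sb-after the load, so hb-after `data = 42`
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Because the write to `data` happens before the read, there is no data race and the function returns 42 on every run.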
- definition of `sw`
non-math
Think of it as a `rel`, `acq` pair, as the other section of cppreference_1 says.
If an atomic store in thread A is a release operation, an atomic load in thread B from the same variable is an acquire operation, and the load in thread B reads a value written by the store in thread A, then the store in thread A synchronizes-with the load in thread B.
math (see the paper_1 for more details)

Next, a release event a synchronizes with (sw) an acquire event b, whenever b (or, in case b is a fence, some sb-prior read) reads from the release sequence of a (or in case a is a fence, of some sb-later write).
- From the above, the reflexive-closure symbol `?` means the step is optional (an "or" between taking it and skipping it).
- The mode ordering is a partial order that includes equality, so the release-or-stronger events also include the `sc` ones, following the bottom-right figure on p. 7 of paper_1.
- The `[F]` fences are always placed explicitly on weak architectures like POWER, which paper_1 discusses, and they are related to `rel`, `acq`, etc. (see this link from paper_2 for how the compiler adds these fences).
`rs; rf` is implied by `rel` and `acq`, and is there to take RAR (read after read) into account.
- definition of `rs`
In the definition of `rs`, `rf; rmw` is there to produce sequences like "write, read, write".
- In summary, the above means "rel_event, (fence-sync), release_relation, read-from, relaxed_read, (fence-sync), acq_event".
- Here "relaxed_read" appears because it doesn't influence the relation between `rel` and `acq`.
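For readers who want the formulas next to the explanation above, this is my transcription of the `rs` and `sw` definitions as I read them in paper_1 (treat it as an assumption on my side and check the paper for the authoritative version). The pieces line up with the summary: a release-or-stronger event, an optional fence step, the release sequence, a reads-from edge, a relaxed-or-stronger read, an optional fence step, and an acquire-or-stronger event.

```latex
\begin{align*}
rs &\triangleq [W];\ (sb|_{loc})^{?};\ [W^{\sqsupseteq rlx}];\ (rf;\ rmw)^{*} \\
sw &\triangleq [E^{\sqsupseteq rel}];\ ([F];\ sb)^{?};\ rs;\ rf;\ [R^{\sqsupseteq rlx}];\ (sb;\ [F])^{?};\ [E^{\sqsupseteq acq}]
\end{align*}
```

Here `?` is the reflexive closure (the step may be skipped), `*` is the reflexive-transitive closure, and `[X]` is the identity relation on the set X.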
definition of `sb`. Here I take the definition in paper_3 to highlight that the events are in the same thread.
definition of the single total order (`S` in paper_1)
relation between "the single total order" and "happens-before", and the reasons for the changes
original version in C++11, from paper_1:
(S1) S must include hb restricted to SC events (formally: $[E^{sc}];\ hb;\ [E^{sc}] \subseteq S$);
[...] (the conditions with which the paper found no problems are unchanged)
non-math: `S` needs to conform to the hb (happens-before) order
math: see the above equation.
- How this causes problems: see "A happens-before C" above.
Why does the original C++11 model fail on specific examples?
Because the `sync` fences are dropped, which implies a weaker memory model.
The above 2 quotes say the same thing about how the synchronization fences get dropped (i.e. the `hwsync` that a "Store Seq Cst" (`sc`) would use is avoided and only `lwsync` is present).
from the paper_1:
So, if requiring that hb (on SC events) be included in S is too strong a condition, what should we require instead?
- Based on the above, the paper replaces (S1) with (S1fix).
- How the change makes it work: after the change, the example pattern `sb; hb` is dropped (see Figure 3: the edge from k to l is `sb`, and here specifically `hb = sw`). So the old happens-before between A and C is no longer taken into account in the single total order.
- Other examples also work after the changes; try them if you are interested.
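The Figure 3 argument can also be checked mechanically. The sketch below is my own illustration, not code from the papers: the event numbering, the names `Rel`, `compose`, `s_constraints_cyclic`, etc. are hypothetical, the `mo` and `rb` edges are the ones implied by the outcome `r1 == 1 && r2 == 3 && r3 == 0`, and the (S1fix) formula `[E^sc]; (sb ∪ sb; hb; sb ∪ hb|loc); [E^sc] ⊆ S` is my reading of paper_1, so please treat it as an assumption. It collects the edges that `S` must contain and tests whether they form a cycle.

```cpp
#include <array>
#include <cassert>
#include <set>
#include <utility>

// Hypothetical encoding of the six events of Figure 3 (k..p correspond to A..F).
namespace ev { constexpr int k = 0, l = 1, m = 2, n = 3, o = 4, p = 5; }

using Rel = std::set<std::pair<int, int>>;

Rel unite(Rel a, const Rel& b) { a.insert(b.begin(), b.end()); return a; }

// Relational composition "a; b": x -> z whenever x -a-> y -b-> z.
Rel compose(const Rel& a, const Rel& b) {
    Rel out;
    for (const auto& [x, y] : a)
        for (const auto& [y2, z] : b)
            if (y == y2) out.insert({x, z});
    return out;
}

// r+ : keep adding two-step paths until nothing changes.
Rel transitive_closure(Rel r) {
    for (;;) {
        Rel next = unite(r, compose(r, r));
        if (next == r) return r;
        r = std::move(next);
    }
}

// [E^sc]; r; [E^sc] : keep only edges between SC events (k, m, o, p).
Rel restrict_to_sc(const Rel& r) {
    const std::set<int> sc = {ev::k, ev::m, ev::o, ev::p};
    Rel out;
    for (const auto& [a, b] : r)
        if (sc.count(a) && sc.count(b)) out.insert({a, b});
    return out;
}

// r|loc : keep only edges between accesses to the same variable.
Rel same_location(const Rel& r) {
    const std::array<char, 6> loc = {'x', 'y', 'y', 'y', 'y', 'x'};
    Rel out;
    for (const auto& [a, b] : r)
        if (loc[a] == loc[b]) out.insert({a, b});
    return out;
}

bool has_cycle(const Rel& r) {
    for (const auto& [a, b] : transitive_closure(r))
        if (a == b) return true;
    return false;
}

// True if the edges that S must contain are cyclic, i.e. no total order S exists.
bool s_constraints_cyclic(bool use_s1fix) {
    const Rel sb = {{ev::k, ev::l}, {ev::m, ev::n}, {ev::o, ev::p}};
    const Rel sw = {{ev::l, ev::m}};                       // m reads from l
    const Rel hb = transitive_closure(unite(sb, sw));
    assert(hb.count(std::make_pair(ev::k, ev::m)) == 1);   // "A happens-before C"
    const Rel other = {{ev::m, ev::o}, {ev::p, ev::k}};    // mo and rb on SC events
    const Rel from_hb = use_s1fix
        ? unite(sb, unite(compose(compose(sb, hb), sb), same_location(hb)))  // (S1fix), as assumed above
        : hb;                                                                // (S1)
    return has_cycle(unite(restrict_to_sc(from_hb), other));
}
```

With (S1) the required edges are k→m, m→o, o→p, p→k, which is the cycle described above, so `s_constraints_cyclic(false)` is true. With (S1fix) the k→m edge disappears because the path k→l→m is `sb; sw`, which neither ends with an `sb` edge nor connects accesses to the same location, so `s_constraints_cyclic(true)` is false and the order m, o, p, k (C-E-F-A) is allowed.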