
P0668R5 made some changes to the sequentially-consistent ordering. The following example, from the proposal (also from cppreference), describes the motivation for the modification.

    // Thread 1:
    x.store(1, std::memory_order_seq_cst); // A
    y.store(1, std::memory_order_release); // B
    // Thread 2:
    r1 = y.fetch_add(1, std::memory_order_seq_cst); // C
    r2 = y.load(std::memory_order_relaxed); // D
    // Thread 3:
    y.store(3, std::memory_order_seq_cst); // E
    r3 = x.load(std::memory_order_seq_cst); // F

where the initial values of x and y are 0.
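For anyone who wants to experiment, here is a minimal sketch of the snippet above as a compilable harness. The names `Outcome`, `run_once`, `is_target` and `count_target` are mine, not from the proposal, and I am assuming `std::atomic<int>` for `x` and `y`. Thread start-up cost dominates each run, so the interesting outcome may never show up; whether it can appear at all depends on the hardware and compiler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values read by threads 2 and 3 in one run (hypothetical helper type).
struct Outcome {
    int r1;
    int r2;
    int r3;
};

// Run the three threads once, with x and y both starting at 0.
Outcome run_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    Outcome out{-1, -1, -1};

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);  // A
        y.store(1, std::memory_order_release);  // B
    });
    std::thread t2([&] {
        out.r1 = y.fetch_add(1, std::memory_order_seq_cst);  // C
        out.r2 = y.load(std::memory_order_relaxed);          // D
    });
    std::thread t3([&] {
        y.store(3, std::memory_order_seq_cst);       // E
        out.r3 = x.load(std::memory_order_seq_cst);  // F
    });

    t1.join();
    t2.join();
    t3.join();

    // x only ever holds 0 or 1, so F can read nothing else.
    assert(out.r3 == 0 || out.r3 == 1);
    return out;
}

// True for the outcome discussed in the proposal: r1 == 1, r2 == 3, r3 == 0.
bool is_target(const Outcome& o) {
    return o.r1 == 1 && o.r2 == 3 && o.r3 == 0;
}

// Count how many of `iterations` runs produce the target outcome.
int count_target(int iterations) {
    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        if (is_target(run_once())) {
            ++hits;
        }
    }
    return hits;
}
```

Calling `count_target(100000)` from `main` and printing the result is enough to try it. A count of zero does not show that the outcome is forbidden, only that this machine did not produce it.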

According to the proposal, the outcome r1 == 1, r2 == 3, r3 == 0 can be observed, but the standard before the modification does not allow it:

> The indicated outcome here is disallowed by the current standard: All memory_order_seq_cst accesses must occur in a single total order, which is constrained to have F before A (since it doesn't observe the store), which must be before C (since it happens before it), which must be before E (since the fetch_add does not observe the store, which is forced to be last in modification order by the load in Thread 2). But this is disallowed since the standard requires the happens before ordering to be consistent with the sequential consistency ordering, and E, the last element of the sc order, happens before F, the first one.

To solve this problem, C++20 changed the meaning of strongly happens-before (the old meaning was renamed to simply happens-before). Under the modified rule, although A happens before C, A does not strongly happen before C, so A does not need to precede C in the single total order.

I'm wondering about the result of the modification. According to cppreference, the single total order of memory_order_seq_cst is C-E-F-A (I don't know why). But according to the happens-before rule, A still happens-before C, so the side effects of A should be visible to C. Does this mean that A precedes C in the modification order seen by thread 2? If so, does this mean that the single total order seen by all threads is not consistent? Can someone explain the above example in detail?

Pluto
  • Does this answer your question? [What is the significance of 'strongly happens before' compared to '(simply) happens before'?](https://stackoverflow.com/q/70554277/2752075) – HolyBlackCat Aug 22 '22 at 14:25
  • @HolyBlackCat I have read your answer in detail before posting this question, but I still don't quite understand it. For example, the modification order observed by thread 2 here seems to be inconsistent with single total ordering, does this violate the standard's requirement for `memory_order_seq_cst`? – Pluto Aug 22 '22 at 14:36
  • You mean the modification order of `y` is inconsistent with the global seq-cst order? Yes, this is now legal. Those two orders don't affect each other; instead, they're affected by similar things: the former is defined by "(simply) happens before" ([`[intro.races]/14..18`](http://eel.is/c++draft/intro.races#14)), and the latter is defined by "strongly happens before" and "coherence-ordered before" ([`[atomics.order]/4`](http://eel.is/c++draft/atomics.order#4)). Since "strongly happens before" now imposes weaker requirements than "happens before", the seq-cst order can become wacky. – HolyBlackCat Aug 22 '22 at 17:33
  • @HolyBlackCat According to the revised rules, happens before does not determine a single total order. In this example, however, operation `A` happens before operation `C`, so in the global order **observed by thread 2**, `A` should come before `C`, but `A` comes after `C` in the actual single total order. Does this mean that not all threads observe a **consistent** global seq-cst order? – Pluto Aug 23 '22 at 02:28
  • Yes, the seq-cst order is consistent across all threads. It boils down to what 'observing' it means. The understanding I ended up with is that the only thing affected by seq-cst (as opposed to just acq/rel) ([in absence of fences](https://stackoverflow.com/q/70577560/2752075)) is the values you get from seq-cst loads. C still happens after A (which I consider the "true" execution order), but C has a weird position in the seq-cst order, so the load returns a weird value (almost as if it was demoted to an acquire load). – HolyBlackCat Aug 24 '22 at 07:04
  • @HolyBlackCat Can you explain in detail how to achieve that A actually happens before C but A is later in the single total order? Also, could A be reordered after B because the value of `x` is not actually read in thread 2? – Pluto Aug 24 '22 at 08:44
  • I don't know much about how this works on the hardware level. My understanding is that when you mix seq-cst and acq/rel operations this way, the seq-cst order becomes somewhat imaginary and detached from reality. I prefer to think of it in terms of the affected seq-cst operation being demoted to an acq/rel one, not participating in the seq-cst order, though this might be a weaker constraint than what the standard mandates. – HolyBlackCat Aug 24 '22 at 16:01

1 Answer


Note that A and C operate on different objects, so it's meaningless to say "the side effects of A should be visible to C". If you mean that the side effect of B is visible to C, then yes, and that does not conflict with the single total order C-E-F-A of the `memory_order_seq_cst` operations:

> a memory_order_seq_cst load gets its value either from the last memory_order_seq_cst modification, or from some non-memory_order_seq_cst modification that does not happen-before preceding memory_order_seq_cst modifications.
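As for why the order is C-E-F-A, here is a rough sketch that brute-forces it. It takes the ordering constraints from the proposal's reasoning for the outcome r1 == 1, r2 == 3, r3 == 0 (F before A, C before E, E before F, plus A before C under the old rules only) and enumerates all orderings of the four seq_cst operations. The function names and the encoding of constraints as character pairs are my own illustration, not anything from the standard.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// {first, second} means "first must precede second in the single total order".
using Constraint = std::pair<char, char>;

// Return every ordering of the seq_cst operations A, C, E, F that
// satisfies all of the given constraints.
std::vector<std::string> consistent_orders(const std::vector<Constraint>& constraints) {
    std::string ops = "ACEF";  // sorted, so next_permutation visits all 24 orderings
    for (const Constraint& c : constraints) {
        // Constraints may only mention the four seq_cst operations.
        assert(ops.find(c.first) != std::string::npos);
        assert(ops.find(c.second) != std::string::npos);
    }

    std::vector<std::string> result;
    do {
        bool ok = std::all_of(constraints.begin(), constraints.end(),
                              [&](const Constraint& c) {
                                  return ops.find(c.first) < ops.find(c.second);
                              });
        if (ok) {
            result.push_back(ops);
        }
    } while (std::next_permutation(ops.begin(), ops.end()));
    return result;
}

// Constraints that survive in C++20 for the outcome r1 == 1, r2 == 3, r3 == 0.
std::vector<Constraint> cxx20_constraints() {
    return {
        {'F', 'A'},  // F reads 0, so it does not observe A's store to x
        {'C', 'E'},  // C's write precedes E in the modification order of y
        {'E', 'F'},  // E is sequenced before F in thread 3
    };
}

// Before C++20, "A happens before C" also constrained the total order.
std::vector<Constraint> pre_cxx20_constraints() {
    std::vector<Constraint> c = cxx20_constraints();
    c.push_back({'A', 'C'});
    return c;
}
```

With the old constraint set the four requirements form a cycle, so `consistent_orders(pre_cxx20_constraints())` comes back empty. Without A before C, the only ordering left is `"CEFA"`.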

xskxzr
  • Sorry, I still don't quite understand why *it's meaningless to say "the side effects of A should be visible to C"*. Can you explain it in detail? – Pluto Aug 24 '22 at 09:08
  • @Pluto C fetches the value of `y`, but the side effect of A is to write `x`. Why do you think they are related? – xskxzr Aug 25 '22 at 01:16
  • So does this mean that even though A 'happens before' C, since the value of `x` is not used in thread 2, A can happen after C? If the value of `x` is read after the C operation in thread 2, does A have to happen before C? – Pluto Aug 25 '22 at 01:46
  • If you mean "A can happen after C" by "A is after C in the total order", then yes. This is emphasized in the proposal and cppreference: "The single total order might not be consistent with *happens-before*". Thread 3 is an example where `x` is read. – xskxzr Aug 25 '22 at 01:57
  • @Pluto By the way, `r3` reading 0 is not the only result allowed by the standard. `r3` may read 1, which is stored by A. In this case, the total order can be A-C-E-F. So regarding your question "If the value of `x` is read after the C operation in thread 2, does A have to happen before C?", A doesn't **have to** be before C in the total order. It depends on the hardware implementation. – xskxzr Aug 25 '22 at 02:07