1

Consider this example from cppreference

#include <thread>
#include <atomic>
#include <cassert>
 
std::atomic<bool> x = {false};
std::atomic<bool> y = {false};
std::atomic<int> z = {0};
 
void write_x()
{
    x.store(true, std::memory_order_seq_cst);  // #1
}
 
void write_y()
{
    y.store(true, std::memory_order_seq_cst); // #2
}
 
void read_x_then_y()
{
    while (!x.load(std::memory_order_seq_cst))  // #3
        ;
    if (y.load(std::memory_order_seq_cst)) {  // #4
        ++z;
    }
}
 
void read_y_then_x()
{
    while (!y.load(std::memory_order_seq_cst))  // #5
        ;
    if (x.load(std::memory_order_seq_cst)) {  // #6
        ++z;
    }
}
 
int main()
{
    std::thread a(write_x);
    std::thread b(write_y);
    std::thread c(read_x_then_y);
    std::thread d(read_y_then_x);
    a.join(); b.join(); c.join(); d.join();
    assert(z.load() != 0);  // will never happen
}

In order to prove the assert will never happen, we should prove either y-store happens before y-load-in-if, or x-store happens before x-load-in-if, or both. They will guarantee the value of z should be at least greater than 0, which makes the assert never happen, as per the below rules.

The value of an atomic object M, as determined by evaluation B, shall be the value stored by some side effect A that modifies M, where B does not happen before A.

If an operation A that modifies an atomic object M happens before an operation B that modifies M, then A shall be earlier than B in the modification order of M.

If a value computation A of an atomic object M happens before a value computation B of M, and A takes its value from a side effect X on M, then the value computed by B shall either be the value stored by X or the value stored by a side effect Y on M, where Y follows X in the modification order of M.

If a value computation A of an atomic object M happens before an operation B that modifies M, then A shall take its value from a side effect X on M, where X precedes B in the modification order of M.

If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B shall take its value from X or from a side effect Y that follows X in the modification order of M.

However, we can only have these conditions

1. x-store synchronizes x-load-in-while
2. y-store synchronizes y-load-in-while
3. x-load-while is sequenced before y-load-in-if
4. y-load-while is sequenced before x-load-in-if 

According to [intro.races] p10, bullet 1 and 3 give:

x-store inter-thread happens before y-load-in-if

bullet 2 and 4 give:

y-store inter-thread happens before x-load-in-if

Then, I cannot find any association that can prove y-store happens before y-load-in-if, or x-store happens before x-load-in-if. The special for this case is they are all seq operations, so there is also an extra rule, called single total order S. Even with that rule, I can only conclude the following things:

1. x-store < x-load-in-while < y-load-in-if, in S
2. y-store < y-load-in-while < x-load-in-if, in S

The symbol < means precedes. Apart from these conclusions, I cannot get more information that can interpret why the assertion will never happen. How to correctly use the current C++ standard to interpret this case?

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
xmh0511
  • 7,010
  • 1
  • 9
  • 36
  • You do understand *informally* that seq_cst guarantees that all threads agree about the order of any two stores, right, due to the existence of a single total order of all atomic operations even on different objects? (And that `acq_rel` and weaker don't: [Will two atomic writes to different locations in different threads always be seen in the same order by other threads?](https://stackoverflow.com/q/27807118) / [Acquire/release semantics with 4 threads](https://stackoverflow.com/q/48383867)) So program behaviour is explainable as some interleaving of source order (if you only use SC ops.) – Peter Cordes Mar 03 '23 at 10:09
  • @PeterCordes I look forward to the formal interpretation of the case in question. As I said in the post, I tried to interpret the result from two directions, one is "happens before" and another is the single total order. However, I cannot prove the assertion will never happen in such two directions, at least, I cannot establish the connection between `store` and `load-in-if` in such two directions. – xmh0511 Mar 03 '23 at 11:09
  • 1
    I've written some memory model proofs on previous questions. I don't think any of them address this exact example, but they may give you some ideas on how you could structure a proof. https://stackoverflow.com/questions/74013717/is-the-implementation-of-the-hazard-pointer-in-c-concurrency-in-action-flawed/74023184#74023184, https://stackoverflow.com/questions/73356944/changing-code-from-sequential-consistency-to-a-less-stringent-ordering-in-a-barr/73383672#73383672, https://stackoverflow.com/questions/70577560/are-seq-cst-fences-exactly-the-same-as-acq-rel-fences-in-absence-of-seq-cst-load/ – Nate Eldredge Mar 03 '23 at 15:33
  • Typically, as Peter said below, a proof by contradiction works well. You begin by assuming that the "bad result" occurs, and work out what values must be observed by which loads in order to cause it. For sequential consistency, you look at the coherency-order rules [atomics.order p3] to deduce which operations must have been coherency-ordered before each other. Then you look at p4 to make deductions about how those operations must have been ordered in the total order S. – Nate Eldredge Mar 03 '23 at 15:38
  • Eventually you end up with a contradiction, such as two entries in S that each precede the other, which is contrary to the definition of "total order". – Nate Eldredge Mar 03 '23 at 15:38
  • @NateEldredge So, what do I need to assume here? Assume `#4` happens before `#2` and `#6` happens before `#1`? – xmh0511 Mar 03 '23 at 15:51
  • 1
    The above assumption may a proof by contradiction. However, if we assume `#4` reads the initial value `false` and `#6` reads the initial value `false`, it seems we cannot get the order of `#2` and `#1` in S. – xmh0511 Mar 03 '23 at 16:02
  • @xmh0511: duck's answer is what I had in mind. You show that if #4 returns false then #1 must precede #2 in S. And if #6 returns false then #2 precedes #1. If they both return false, then #1 and #2 both precede each other, a contradiction. – Nate Eldredge Mar 04 '23 at 03:07
  • @NateEldredge Another confusion is when we say `#4` reads `false`, which bullet says the initial write precedes `#2` in the modification order of `y`? [atomics.order]/3.3 requires: *and X precedes B in the modification order of M* – xmh0511 Mar 04 '23 at 03:28
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/252293/discussion-between-nate-eldredge-and-xmh0511). – Nate Eldredge Mar 04 '23 at 03:35

3 Answers3

5

Both ++z (at #4 and #6) happen before z.load() in main ([thread.thread.member]/4, [intro.races]/9, /10.2), if they happen at all.

These side effects are necessarily visible to the load ([intro.races]/18), so the only way for the assert to fire is for both #4 and #6 to read false.

In read_x_then_y, #3 reads true written by #1, meaning #1 is coherence-ordered before #3 ([atomics.order]/3.1), and #3 strongly happens before the load of y at #4 ([intro.races]/12.1).

#4 reading false would make it coherence-ordered before the write of true at #1 ([atomics.order]/3.3, where X is the initial 'write' of false).


Few points of clarification here:

  • The first part of [atomics.order]/3.3 simply states that A and B must be different atomic operations, it does not limit the scope of this bullet to just read-modify-write ones. P0668, the paper that introduced the wording, summarizes:

    The third bullet corresponds to what's often called "reads before": A reads a write earlier than B.

  • Initialization of an object not technically being a side effect is the subject of CWG2587, and is not unique to atomics. I think it is clear enough that the intent here is that these rules include initialization when talking about side effects.

  • (Constant) initialization of y happening before #2 is another point that the standard (by my reading, at least) fails to explicitly specify, but we can once again apply common sense: if it doesn't happen before #2, this program has undefined behavior because it accesses an out-of-lifetime object ([res.on.objects]/2), and any further analysis is meaningless. That would of course be absurd, so we'll assume that #2 does in fact happen after y is initialized, and as such it follows the initialization in the modification order of y ([intro.races]/15).


This analysis gives us the ordering #1 < #3 < #4 < #2 in the total order S of seq_cst operations ([atomics.order]/4) on an execution where #4 in read_x_then_y reads false.

Repeating the same for read_y_then_x (once again assuming #6 gets false) yields #2 < #5 < #6 < #1.

In the first case, #1 precedes #2 in S, in the second case, #2 precedes #1. Clearly, there's no possible S that is consistent with both of these, meaning #4 and #6 can't both read false on any possible execution of this program. Quoting P0668 again:

If these constraints are not satisfiable by any total order S, then the candidate execution which gave rise to the coherence order is not valid.

duck
  • 1,455
  • 2
  • 8
  • The conclusion `#1 < #3 < #4 < #2` is based on [atomics.order]/3.3 can work, however, neither `store` nor `load` are read-modify-write operations. p3.3 says *A and B are not the same atomic read-modify-write operation*, does it mean `A` and `B` should be read-modify-write operation but not the same, or `A` and `B` can be other kind of operation that is not read-modif-write operation, or both? – xmh0511 Mar 04 '23 at 02:00
  • @xmh0511: I thought at first that 3.3 might only be talking about atomic RMWs, but after reading the other 3 cases it seems clear they don't. 3.3 does mean that A can be a load or an RMW, and B can be an RMW or a pure store. A load and a store are "not the same RMW". 3.1 covers the case when a load *does* see the value from a store, 3.3 covers the case where it doesn't. (3.2 is store order of two modifications of the same object. And 3.4 makes it transitive or whatever you call that.) (@ duck, you might want to expand your answer to clear up that point of possible confusion.) – Peter Cordes Mar 04 '23 at 02:17
  • `If #4 reads false, it is coherence-ordered before #2 ([atomics.order]/3.3, with X being the initial 'write' of false)`, however, which bullets says the initial `write` of false precedes `#2` in the modification order of `y`? – xmh0511 Mar 04 '23 at 03:32
  • @xmh0511: Hmm, that is problematic; the constructor isn't an **atomic** modification of M. That might be a defect in the standard unless there's other language somewhere else about that term. The constructor's write of the initial value isn't guaranteed atomic; it's not safe for other threads to read an `atomic` until after its constructor returns. (In this case it's a constant initializer for an object with static storage-class, so no run-time initialization is necessary in a typical implementation where there's no other member of an instance. But the standard doesn't guarantee that.) – Peter Cordes Mar 04 '23 at 10:06
  • The threads started by `main` synchronize-with the point where they were started ([Does spawning a thread provide memory order guarantees on its own?](https://stackoverflow.com/q/60423427) although the standardese wording that gives that guarantee seems fairly opaque to me). So values from global initializers are visible (happen-before) anything in any of the threads. And yes, it's obvious that the standard intended to include initial values as the first element of the modification order, glad there's already an issue for that defect. – Peter Cordes Mar 04 '23 at 19:24
  • Also: *All static initialization strongly happens before any dynamic initialization.* https://timsong-cpp.github.io/cppwp/n4868/basic.exec#basic.start.static-2 – Peter Cordes Mar 04 '23 at 19:59
  • 1
    What I was referring to is that the standard does not seem to explicitly specify that static initialization happens before `main`; only that it strongly happens before any dynamic initialization (which could be deferred, and does not appear in this program at all). – duck Mar 04 '23 at 20:03
  • Oh right, that's talking about dynamic init of static storage; the local vars in main are separate. Surely there must be some language about globals being initialized before main, unless it's so obvious they forgot to define it :P – Peter Cordes Mar 04 '23 at 20:06
1

Consider each of the two possibilities, x-store < y-store in S, or y-store < x-store in S. One or the other of those must be true, because both stores are part of the total order of all SC operations.

For each case, one of the read functions will provably increment z due to the order of the loads in S. (And the other function might also happen to increment; the second load could happen after both stores have become visible.)

Syncs-with is irrelevant, there are no earlier operations in write_x or write_y for us to see.


Informally, the existence of a single total order of all atomic operations (even on different objects) is what guarantees that all threads agree on the order of two independent stores; this is an IRIW litmus test. (Independent Readers, Independent Writers).

A C++ program that's data-race free and uses seq_cst for all its atomic operations behaves in a way that's explainable as some interleaving of source order. i.e. sequentially consistent execution of data-race free programs, SC-DRF.

Fun fact: in CPU memory models, only the weakest models actually allow IRIW "reordering" and real hardware that actually does it (regardless of what's allowed on paper) is even rarer. e.g. ARMv8 disallows it but otherwise is quite weak; ARMv7 allowed it but no ARM hardware ever did it. Having all cores agree on a store order is vastly weaker than sequential consistency, and happens automatically if store data only becomes visible to other cores via coherent cache.

But in C++, only seq_cst forbids it; if you'd used acq_rel (or release for the stores and acquire for the loads), z == 0 would be allowed.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • Thanks, but you still didn't interpret the case in a language-layer way. – xmh0511 Mar 03 '23 at 12:47
  • @xmh0511: Your reply to my comment under the question made me think this was something you were missing, since you didn't confirm an understanding that the existence of a total order of all loads and stores would be contradicted by an execution that resulted in z == 0. Formally, you'd do a proof by contradiction, showing that an execution where both `if`s are false is incompatible with both orderings in the total order, so it can't happen. – Peter Cordes Mar 03 '23 at 13:01
  • Is it possible that an x-store and x-load-in-if, and a y-store and y-load-in-if simultaneously happen? If it is not, do you mean the load and store should be ordered in the total order, that is either precedes the other or follows the other, which is not like the modification order of a single atomic object that only orders modifications, which means a load operation can simultaneously happen with store operation for the same atomic object? – xmh0511 Mar 03 '23 at 13:14
  • @xmh0511: Yes, each operation including loads is part of the total order S. A load either sees the value (after in the total order) or not (before in the total order). As for non-SC loads being simultaneous with a store, IDK what that's supposed to mean. It either sees the value from a store (after it) or not (seeing the value from an earlier operation in the modification order of the object being read). – Peter Cordes Mar 03 '23 at 13:54
  • [intro.races] p14 says *The value of an atomic object M, as determined by evaluation B, shall be the value stored by some side effect A that modifies M, where B does not happen before A.* I don't know what the opposite of "B does not happen before A" should say. It may include "B happens before A" and unordered? I'm not sure. – xmh0511 Mar 03 '23 at 15:59
  • @xmh0511: That wording is ruling out a some of the side-effects on M as being possible values for B. Other factors may still narrow it further, and ISO C++ doesn't require that repeated runs on real hardware must randomly produce every value that the standard allows. But yes, candidate values not excluded include ones where A happen before B, and ones where nothing orders one wrt. the other, no synchronization between threads. So yes, the opposite of "B does not happen before A" is "B does happen before A", those are the A values ruled out by this, where the load happens-before the store. – Peter Cordes Mar 03 '23 at 18:45
  • Even though I prove "x-load-in-if happens before x-store and y-load-in-if happens before y-load" cannot happen, I cannot exclude the case where x-load-in-if and x-store are unordered(i.e they have no synchronization), the same is true for y-load-in-if and y-store. Even though all `seq_cst` operations must have an order in the single total `S`, however, it cannot say one preceding the other in S would happen before the other because all rules say one happening before the other should precede the other in S, but not the other way around. – xmh0511 Mar 04 '23 at 00:16
  • @xmh0511: That's correct, being earlier in S isn't a happens-before. There must be different rules somewhere that say something about "earlier in S" for loads seeing values from stores; I'd have to check the standard's wording. That is after all the whole point of sequentially-consistent execution. – Peter Cordes Mar 04 '23 at 00:18
  • I haven't found that rule. Please cite the rule here if you find it. I think "earlier in S" for loads seeing values from stores" is significant to make the proof by contradictory meaningful for this case. – xmh0511 Mar 04 '23 at 00:24
  • @xmh0511: See duck's answer: a load that doesn't see the value from a store is "coherent-ordered before" that store, regardless of whether that was guaranteed by synchronization or was just a coincidence of timing. And S has to be consistent with coherence-ordered. – Peter Cordes Mar 04 '23 at 01:22
  • But, `load` and `store` are not read-modify-write operations. p3.3 requires *A and B are not the same atomic read-modify-write operation*, I'm not sure what it exactly means. Does it mean A and B **should only be read-modify-write operation** but not the same, or A and B can be another kind of operation that is not read-modify-write operation, or both? – xmh0511 Mar 04 '23 at 02:02
  • So, with your clarification of p3.3 in the comment on Duck's answers. I give my thought... – xmh0511 Mar 04 '23 at 02:35
  • Express conveniently, x-store `XA`, x-load-in-if `XB`, y-store `YA`, y-load-in-if `YB`. The opposite of "XB happens before XA and YB happens before YA" is "XB does not happen before XA, or YB does not happen before YA", you just prove the former case is not possible by checking whether `S` is contradictory, However, we should also check whether the opposite case can make assertion never happen or not. After all, "XB does not happen before XA": includes `XA` happens before `XB` and `XA` is unordered with `XB`, the same is true for `YA` and `YB`... – xmh0511 Mar 04 '23 at 02:36
  • IIUC, the unordered case should only include two possibilities: 1. load sees the value of the store, 2. the stored value is not seen. Eventually, we should prove: 1. `XA` happens before `XB`, `YA` happens before `YB`, 2. `XA` happens before `XB`, `YA` is unordered with `YB`, 3. `XA` is unordered with `XB`, `YA` happens before `YB`, 4. `XA` is unordered with `XB`, `YA` is unordered with `YB`. And check which cases can make `S` true. This is the proof from its front. Then for each unordered case, we will talk about the cases(see the value or not)... – xmh0511 Mar 04 '23 at 02:37
  • Finally, we can find that, only z must at least be 1 can `S` be a meaningful total order. I want to say "XB happens before XA and YB happens before YA" is not possible, which does not always mean "XB does not happen before XA, or YB does not happen before YA" can never make an assertion happen. We should direct prove it cannot, right? – xmh0511 Mar 04 '23 at 02:48
  • @xmh0511: `S` doesn't create happens-before synchronization, IIRC, so I think you can only formally prove that either XA is *coherence-ordered before* XB, or same for YA / YB. I'm not sure you get "happens before" out of this reasoning, not in the technical formal sense it's used in the C++ standard. However, since all the operations are seq_cst, https://eel.is/c++draft/atomics.order#6 isn't a problem; any happens-before that exists *will* be consistent with `S`, unlike if we'd had some happens-before relationship between `release` / `seq_cst` or `seq_cst` / `acquire` pairs. – Peter Cordes Mar 04 '23 at 02:49
  • I just meant, say "XB happens before XA and YB happens before YA is not possible" cannot prove other cases will make the assertion never happen. We should also prove the assertion will never happen in **all possible cases**. – xmh0511 Mar 04 '23 at 02:55
  • Another thing I think we should clarify is when we say `#4` reads `false`, which bullet says the initial write precedes `#2` in the modification order of `y`? Since p3.3 requires *and X precedes B in the modification order of M* – xmh0511 Mar 04 '23 at 03:29
  • @xmh0511: Like I said, happens-before is only useful in ordering #1 before #2 in `S`, and same for #3 before #4. The rest of the proof relies on a contradiction between `S` and coherence-ordered for an execution where neither thread increments `z`. – Peter Cordes Mar 04 '23 at 03:31
  • @xmh0511: You mean the initializer during construction of `std::atomic y = {false};`, which is not an atomic operation at all? Interesting question, yeah I guess that must count as the first element of its modification order. – Peter Cordes Mar 04 '23 at 03:34
  • "The rest of the proof relies on a contradiction between S and coherence-ordered for an execution where neither thread increments z.", *neither thread increments z* is based on the assumption: `YB` happens before `YA` and `XB` happens before `XA`, according to [intro.races] p14 isn't it? Such that all `load` cannot see the stored value. My thought is we can prove the assertion never happen by listing all possible assumptions, as listed in the above comment. – xmh0511 Mar 04 '23 at 03:39
  • @xmh0511: Stop trying to prove "happens before" or not. You only need to prove "coherence-ordered before", either `YA` before `YB` or `XA` before `XB`. which is something you can do based on `S`. Or that you can't have both `YB` coherence-ordered before `YA` and `XB` before `XA`. – Peter Cordes Mar 04 '23 at 03:43
  • @xmh0511: Re: reading the initializer instead of a store or RMW value: that's probably why [intro.races]/18 (https://eel.is/c++draft/intro.multithread#intro.races-18) write-read coherence talks about "a side effect X on an atomic object M", *not* specifying that the side effect has to be an atomic operation. So it can include construction of the object. We can't say that a read is coherence-ordered after the constructor, because that concept only applies to pairs of atomic operations, but we can say that the constructor happened-before the reader thread. (constructed before thread start) – Peter Cordes Mar 04 '23 at 03:45
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/252294/discussion-between-xmh0511-and-peter-cordes). – xmh0511 Mar 04 '23 at 03:47
1
  1. According to this:

3 An atomic operation A on some atomic object M is coherence-ordered before another atomic operation B on M if
(3.1)A is a modification, and B reads the value stored by A, or
(3.2)A precedes B in the modification order of M, or
(3.3)A and B are not the same atomic read-modify-write operation, and there exists an atomic modification X of M such that A reads the value stored by X and X precedes B in the modification order of M, or
(3.4) — there exists an atomic modification X of M such that A is coherence-ordered before X and X is coherence-ordered before B.

we need the following coherence-ordered before relations if we want x-load-in-if and y-load-in-if to return false:

  • for x: x-load-in-if -> x-store -> x-load-in-while
  • for y: y-load-in-if -> y-store -> y-load-in-while
  1. According to this:

12 An evaluation A strongly happens before an evaluation D if, either
(12.1)A is sequenced before D, or
(12.2)A synchronizes with D, and both A and D are sequentially consistent atomic operations ([atomics.order]), or
(12.3) — there are evaluations B and C such that A is sequenced before B, B simply happens before C, and C is sequenced before D, or
(12.4) — there is an evaluation B such that A strongly happens before B, and B strongly happens before D.
[Note 11: Informally, if A strongly happens before B, then A appears to be evaluated before B in all contexts. Strongly happens before excludes consume operations. — end note]

these are the strongly happens before relations for our example:

  • x-store -> x-load-in-while -> y-load-in-if
  • y-store -> y-load-in-while -> x-load-in-if
  1. According to this:

4 There is a single total order S on all memory_­order​::​seq_­cst operations, including fences, that satisfies the following constraints. First, if A and B are memory_­order​::​seq_­cst operations and A strongly happens before B, then A precedes B in S. Second, for every pair of atomic operations A and B on an object M, where A is coherence-ordered before B, the following four conditions are required to be satisfied by S:
(4.1) — if A and B are both memory_­order​::​seq_­cst operations, then A precedes B in S; and
...

we need to put all the operations in a single total order S, while preserving both:

  • coherence-ordered before relations:
    • x-load-in-if -> x-store -> x-load-in-while
    • y-load-in-if -> y-store -> y-load-in-while
  • and strongly happens before relations:
    • x-store -> x-load-in-while -> y-load-in-if
    • y-store -> y-load-in-while -> x-load-in-if

Let's first look at strongly happens before relations:

  • no matter how exactly you put the operations in a single total order S, either y-load-in-if or x-load-in-if will be the last operation
  • if x-load-in-if will be the last operation, then x-store -> x-load-in-if in S, but according to coherence-ordered before it should be x-load-in-if -> x-store in S => contradiction.
    The same logic works when y-load-in-if is the last operation in S.

As a result, executions where both x-load-in-if and y-load-in-if return false (and z=0) aren't allowed in C++.

вася
  • 11
  • 1
  • If I change `x-load-in-if` and `y-load-in-if` to `memory_­order​::​acquire`, that is, I deliberately cancel the orders that `y-load-in-while` precedes `x-load-in-if`, and `x-load-in-while` precedes `y-load-in-if`, in S, how do you analyze such a case? I want to know the difference after the change. – xmh0511 Mar 04 '23 at 09:00
  • @xmh0511: `y-load-in-while` is still sequenced-before `x-load-in-if`, so it still "strongly happens before". But `x-load-in-if(acquire)` isn't part of `S`, so its coherence ordering is not constrained to be consistent with strongly-happens-before. IRIW reordering is allowed, the assert can fail if both (or either?) `if` loads are `acquire`. (Making the `while` spin-loop loads `acquire` instead of `seq_cst` would I think not let the assert fail. The stores and the load-in-ifs still have to go into `S`, which rules out coherence-orders where each load-in-if is before its respective store.) – Peter Cordes Mar 04 '23 at 19:54
  • @PeterCordes *Making the while spin-loop loads acquire instead of seq_cst would*, in this case, we can get two conclusions: 1. `x-store` precedes `y-load-in-if` in S, 2. `y-store` precedes `x-load-in-if` in S. If we say `x` loads false and `y` loads false, according to *coherence-ordered before*(all seq_cst), we will get `x-store` < `y-load-in-if` < `y-store`< `x-load-in-if` < `x-store`, which is contradictory, so the assertion won't fail. However, see the last example in [reference](https://en.cppreference.com/w/cpp/atomic/memory_order), it says: Any other ordering may trigger the assert – xmh0511 Mar 05 '23 at 01:23
  • @xmh0511: "Any other ordering" might be talking about changing the `memory_order` parameter for *all* operations, not just for one of them. Or the "may trigger" could be talking about the fact that some of them could be changed but others can't, if we're analyzing this correctly. (As well as the fact that most real implementations can't in practice ever fail the assert, only some POWER CPUs.) Or whoever wrote it might not have realized that the spin-wait loads can be `acquire`, again if we're not making any mistakes here. – Peter Cordes Mar 05 '23 at 01:30
  • @PeterCordes So, I keep looking up a formulization way to do the proof. I realize it is easy to be wrong or omit some other possibilities that would make the assertion fail, if we base on some assumptions for which we thought they are all possibilities. – xmh0511 Mar 05 '23 at 04:01