
Code in question:

#include <atomic>
#include <thread>

std::atomic_bool stop(false);

void wait_on_stop() {
  // Spin until another thread sets `stop`.
  while (!stop.load(std::memory_order_relaxed));
}

int main() {
  std::thread t(wait_on_stop);
  stop.store(true, std::memory_order_relaxed);  // signal the spinning thread
  t.join();
}

Since std::memory_order_relaxed is used here, I assume the compiler is free to reorder stop.store() after t.join(). As a result, t.join() would never return. Is this reasoning correct?

If yes, will changing stop.store(true, std::memory_order_relaxed) to stop.store(true) solve the issue?

Lingxi
  • These thread operations are optimization barriers, so no – Passer By May 22 '18 at 07:56
  • @PasserBy Could you provide some references? All I know so far is that the completion of `t` synchronizes with successful return from `t.join()`, which does not help much. – Lingxi May 22 '18 at 08:00
  • 1
    Since `stop` is a global variable, I believe the compiler will emit the code for `stop.store()` before the call to `t.join()`. In the other hand, I think the processor will be allowed to defer the visibility of the `store` operation. – Yann Droneaud May 22 '18 at 08:23
  • @YannDroneaud Till now, your comment is the only thing that makes sense to me. – Lingxi May 22 '18 at 11:37
  • Jeff Preshing has written [an article](http://preshing.com/20170612/can-reordering-of-release-acquire-operations-introduce-deadlock/) on his blog that is related to your question. The answer given by T.C. appears to be correct: atomic stores must become visible within a reasonable time. – LWimsey May 22 '18 at 14:07

2 Answers


[intro.progress]/18:

An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.

[atomics.order]/12:

Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

These are non-binding recommendations. If your implementation follows them - as high-quality implementations should - you are fine. Otherwise, you are screwed. Either way, this holds regardless of the memory order used.


The C++ abstract machine has no concept of "reordering". In the abstract semantics, the main thread stored into the atomic and then blocked, and so if the implementation makes the store visible to loads within a finite amount of time, then the other thread will load this stored value within a finite amount of time and terminate. Conversely, if the implementation doesn't do so for whatever reason, then your other thread will loop forever. The memory order used is irrelevant.

I've never found reasoning about "reordering" to be useful. It mixes up low-level implementation detail with a high-level memory model, and tends to make things more confusing, not less.
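
To make this concrete, here is the program from the question again, annotated with where these recommendations come into play. This is only a sketch of the argument above; the commented-out default (seq_cst) store is assumed to change nothing about termination:

#include <atomic>
#include <thread>

std::atomic_bool stop(false);

void wait_on_stop() {
  // Terminates once the store in main() becomes visible to this thread.
  // [intro.progress]/18 and [atomics.order]/12 only recommend that this
  // happens within a finite/reasonable amount of time.
  while (!stop.load(std::memory_order_relaxed));
}

int main() {
  std::thread t(wait_on_stop);
  stop.store(true, std::memory_order_relaxed);
  // stop.store(true);  // seq_cst by default: no difference for visibility
  t.join();
}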

T.C.
  • But the store may not happen at all, if it is reordered after join. – Lingxi May 22 '18 at 08:58
  • There's no such thing as "reordering" in the abstract machine. – T.C. May 22 '18 at 09:46
  • Related: [Why set the stop flag using \`memory\_order\_seq\_cst\`, if you check it with \`memory\_order\_relaxed\`?](https://stackoverflow.com/q/70581645) has some discussion about the fact that inter-thread latency is a quality-of-implementation issue, C++ just compiles to asm loads and stores; it's hardware cache coherency that gives us low latency. – Peter Cordes Nov 24 '22 at 11:02
  • 1
    @Lingxi: That compile-time reordering would violate the as-if rule, creating a deadlock or infinite loop where one didn't exist in the source. In practice on real implementations, the compiler can't see the code for some library functions `.join()` calls, so it can't be sure it doesn't read your global `atomic_bool stop`. See [How C++ Standard prevents deadlock in spinlock mutex with memory\_order\_acquire and memory\_order\_release?](//stackoverflow.com/q/61299704) for more discussion about what in the standard forbids introducing deadlocks or other infinite loops with static reordering. – Peter Cordes Nov 24 '22 at 11:07

Any function whose definition is not available in the current translation unit has to be treated as if it were a library I/O function: the call is assumed to cause side effects, so the compiler cannot move following statements to precede the call, or preceding statements to follow it.

[intro.execution]:

Reading an object designated by a volatile glvalue ([basic.lval]), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a subexpression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access through a volatile glvalue is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.

And

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

Here the std::thread constructor and std::thread::join are such functions (they eventually call platform-specific thread functions unavailable in the current TU) and thus have side effects. stop.store also causes a side effect (a memory store is a side effect). Hence stop.store cannot be moved before the std::thread constructor or past the std::thread::join call.
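
As a minimal sketch of this argument: opaque_call below is a hypothetical stand-in for those thread functions, with only its declaration visible in this translation unit.

#include <atomic>

std::atomic_bool stop(false);

// Hypothetical function whose definition would live in another .cpp file:
// the compiler sees only this declaration, so it must assume the call may
// read or write the global `stop`.
void opaque_call();

void signal_then_block() {
  stop.store(true, std::memory_order_relaxed);  // side effect on `stop`
  opaque_call();  // the store above cannot be moved past this call, nor can
                  // statements after the call be moved before it
}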

Maxim Egorushkin
  • Then how to explain [this](https://stackoverflow.com/a/37789799/1348273)? I guess `std::chrono::high_resolution_clock::now()` is the kind of function you are talking about. – Lingxi May 22 '18 at 11:32
  • @Lingxi There is already a comprehensive answer to that question. – Maxim Egorushkin May 22 '18 at 11:56