2

Let's say we have 2 threads. One producer and one consumer. We have the producer that produce a data, and the consumer that use this data. However the guard is not atomic !

bool isDataReady = false;
int data = 0;

void Producer() {
  data = 42;
  std::atomic_thread_fence(std::memory_order_release);
  isDataReady = true;
}

void Consumer() {
  while(!isDataReady);
  std::atomic_thread_fence(std::memory_order_acquire);
  assert(data == 42);
}

I wonder why is there a data race on isDataReady. Normally, the correct code should be to use relaxed ordering on an atomic bool variable.

Is it because the write (transaction) to isDataReady could be not finished before the read? And even if it is the case, is it really an issue?

Antoine Morrier
  • 3,930
  • 16
  • 37

1 Answers1

4

TL;DR

This data race is dangerous and you should care about eliminating it. It may not manifest due to your luck, but it will eventually cause a headache.

A bit longer

This code has problems due to a few issues:

  1. While compiling Consumer compiler is not aware that isDataReady can change in background so it is perfectly reasonable to emit while(!isDataReady) an infinite loop or just nothing (due to forward progress guarantee, as was pointed out in comments).

  2. If write and/or read to bool is not atomic (which is not the case on most platforms, yet is theoretically possible) any of reads can cause getting garbage data.

  3. Memory fence with std::memory_order_release ensures that changes that happend in the thread will be visible after other thread calls fence with std::memory_order_acquire (at least in simplification). So, the change of bool variable may be invisible in other thread.

  4. Due to superscalar architecture of modern processorss, operations may be reordered in runtime by processor. So order of memory writes in Producer visible from Consumer may be different than the one put in the code.

bartop
  • 9,971
  • 1
  • 23
  • 54
  • Because of the forward progress guarantee, it would be equally reasonable for the compiler to emit nothing at all rather than an infinite loop (I believe). – You Jan 09 '19 at 13:10
  • For 1), make `isDataReady` volatile fix the problem. 2) The volatile is not enough, and I need atomic operation. 3) The same, I need an atomic (since atomic operation are done in memory location), so the change of bool became visible in other thread. 4) Since I am using memory fencing, it is not really an issue. Am I correct? – Antoine Morrier Jan 09 '19 at 13:14
  • @AntoineMorrier AFAIK 1)y 2)y 3) and 4) yes, but you have to place memory fences properly. – bartop Jan 09 '19 at 13:18
  • @You I thought forward guarantee is important only for atomic, lock-free variables – bartop Jan 09 '19 at 13:19
  • @bartop In my code my memory fences are well placed, no? The "only issue" in my code is because of the data race on `isDataReady` – Antoine Morrier Jan 09 '19 at 13:19
  • 1
    @AntoineMorrier Yup, just move first memory fence line down, and the second inside `do{}while` line above and it will be all fine – bartop Jan 09 '19 at 13:21
  • @bartop: Not quite, I think: https://en.cppreference.com/w/cpp/language/memory_model#Progress_guarantee – You Jan 09 '19 at 13:22
  • @bartop I am reading this article http://www.modernescpp.com/index.php/acquire-release-fences . It seems that the atomic_thread_fence are placed exactly like I did. The only difference between his code and mine is its `isDataReady` is an atomic variable. In this example, the intesting things is to not put an acquire release operation inside a loop, else we could just use an atomic_variable with relase/acquire operation. Since release/acquire operation are heavier than relaxed, using `atomic_thread_fence` should result in faster performance – Antoine Morrier Jan 09 '19 at 13:29
  • 1
    @AntoineMorrier If you care about performance, you might be interested in non-blocking waiting (provided, e.g., by condition variables). Reading an atomic bool in a loop effectively blocks CPU (spin lock - busy waiting), which is useful only if you know that you cannot use that CPU (core) another way and that the waiting will be very short. – Daniel Langr Jan 09 '19 at 13:40
  • @AntoineMorrier BTW, to illustrate the endless-loop problem, the compiler really generates an infinite loop at the machine code level, see https://godbolt.org/z/PJztDp (`.L6: jmp .L6`). The only way to prevent it here would be to put the fence inside the loop. Or, better, to use `atomic`. Do not use `volatile` for what `atomic`s are for: https://stackoverflow.com/q/8819095/580083. – Daniel Langr Jan 09 '19 at 13:49
  • Re, "bool is not atomic ... reads can [get] garbage data." Are you saying that an assignment to a `bool` variable could be a compound operation that temporarily stores a value that is neither `true` nor `false` before it stores the final value? I suppose the language standard could _allow_ that behavior, but given how C++ was evolved from the "C" programming language, and given how conditional operations work in most CPU instruction architectures, I would be very surprised if any reasonable implementation actually worked in that way. – Solomon Slow Jan 09 '19 at 19:38
  • @AntoineMorrier, Re, "only difference...is its isDataReady is an atomic variable." That's not a trivial difference. The compiler must emit appropriate memory fence instructions in proximity to every use of an `atomic` variable. That's part of what `atomic` means. – Solomon Slow Jan 09 '19 at 19:45
  • @SolomonSlow i know that is not a trivial difference at all. I know the difference between an atomic and a non atomic variable. Ton only is maybe misused. But the discussion was about to put or not to put the fence inside the loop :). That is why I take the liberty to say "only" ;) – Antoine Morrier Jan 09 '19 at 20:41