
I want to implement the following

  1. Two threads: producer (can change shared variable a), consumer (waiting for shared variable change)
  2. I want to use atomic to synchronize threads

Problem: I ran my code with both int a; and std::atomic<int> a;, but in 100% of cases I get the same result on my PC:

calculate_val 
process 1

How can I observe a difference between the code with and without atomics?

#include <atomic>
#include <iostream>
#include <thread>

using namespace std;

atomic<int> a; // I tried replacing this with int a; and always got the same result
int val;

int calculate_val() {
    cout << "calculate_val " << endl;
    return 1;
}

void process(int val) {
    cout << "process " << val << endl;
} 

void threadProducer() {
    val = calculate_val();
    a = 1;
}

void threadConsumer() {
    while (!a) {
        
    }
    process(val);
}




int main() {
    thread t2(threadConsumer);
    thread t1(threadProducer);

    t2.join(); // was trying to change order
    t1.join();
    return 0;
}
mascai
  • Why do you expect it to be different? You literally say that `threadConsumer()` shouldn't call `process()` while `a` is not set. – lorro Feb 28 '23 at 23:01
  • Making `a` plain `int` exhibits undefined behavior by way of a data race. "Seems to work" is one possible manifestation of undefined behavior. I wonder whether you are compiling with optimizations enabled; an optimizer could remove `while (!a) {}` loop entirely, since in a conforming program `a` cannot change. – Igor Tandetnik Feb 28 '23 at 23:15
  • @lorro: Without atomics or locking (the I/O introduces locking you don't see, which does mess with things), the write to `a` could be seen before the write to `val`. Threading is weird. – ShadowRanger Feb 28 '23 at 23:16
  • With gcc and -O3 specified, your loop is optimized out, see: https://godbolt.org/z/94YPxPbhh – Fabian Keßler Feb 28 '23 at 23:44

1 Answer


Your scenario is too simple. On an architecture with strongly ordered memory semantics (e.g. x86 and its descendants, which, given that you refer to "my PC", is probably what you're using) and with compiler optimizations turned off, the atomics won't make a difference as long as the compiler doesn't reorder threadProducer's a = 1; ahead of the function call (it has no reason to), doesn't cache the value of a for the while (!a) test, and doesn't eliminate the while (!a) {} loop.

Your existing code could fail without atomics:

  1. On a machine with weakly ordered memory semantics (the writes to a and val could become visible in threadConsumer in either order, and it could take arbitrarily long for them to become visible). I suspect the locking built into the iostream library, and the C stdio it wraps, would probably trigger cache flushes though, so the I/O you are using to observe the race is in fact changing the behavior.
  2. On any machine if the compiler chose to cache the value of a in threadConsumer for efficiency (non-volatile, non-atomic values can be cached to save overhead because it's assumed they're only accessed from one thread, and the loading thread isn't changing it).
  3. On any machine if the compiler decides to eliminate the while (!a) {} loop, due to the non-atomic a potentially creating an infinite loop if cached as suggested in #2 (the compiler may assume that a loop with no side effects eventually terminates).
  4. On any machine if the compiler chose to reorder a = 1; ahead of val = calculate_val(); (since neither line depends on any values computed in the other line, an optimizing compiler could choose to do them in either order). In practice, given the aforementioned locking in the I/O layers, the compiler won't swap these lines, but it could if the I/O were removed.

I'm not going to give an example of things that could go wrong, because a cursory search will find all sorts of examples of when and why you use atomics. What's important to know is that, even in your simple scenario where you thought it was safe without them, it wasn't. Even turning up optimizations might cause your code with a non-atomic a to stop behaving on x86-64 machines, and x86-64 is already threading/atomics on easy mode.
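For reference, here is a minimal sketch of the same handoff written with explicit release/acquire ordering, which is the weakest ordering this pattern needs. The names payload, ready and run_handoff are hypothetical stand-ins (payload plays the role of val, ready the role of a); they are not from the question. Note that plain a = 1; and !a on a std::atomic already use the stronger default, memory_order_seq_cst, so the atomic version in the question is correct as written.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int payload = 0;                  // plain int, published through `ready`
std::atomic<bool> ready{false};   // the flag the consumer spins on

// Runs one producer/consumer handoff and returns the value the consumer saw.
int run_handoff() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread consumer([&seen] {
        // Acquire load: once this reads true, every write the producer made
        // before its release store (here, payload = 1) is visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        assert(payload == 1);     // guaranteed by the release/acquire pairing
        seen = payload;
    });

    std::thread producer([] {
        payload = 1;                                   // ordinary write
        ready.store(true, std::memory_order_release);  // publish it
    });

    consumer.join();
    producer.join();
    return seen;                  // safe to read: join() synchronizes with the thread
}
```

Calling run_handoff() from main returns 1. The atomic load in the loop condition is also what stops the compiler from caching the flag or dropping the loop, which are exactly the failure modes #2 and #3 above.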

ShadowRanger
  • Not only is the compiler allowed to reorder it; the processor is also allowed to reorder instructions. When the function calculate_val is complicated and may have some optimized microcode instructions with computation holes in the pipeline, it is very likely that the assignment is done in between. – Fabian Keßler Feb 28 '23 at 23:29
  • @FabianKeßler: Yep. I don't believe x86-64 will ever do this (the strongly ordered memory model means even if it does the work in the pipeline out of order, it's not observable in any reasonable way), but on another architecture, yep, reordering is a thing. – ShadowRanger Feb 28 '23 at 23:52