In the code below, the write to a in foo sits in the CPU's store buffer and is not yet visible to the load of a (into ra) in bar. Similarly, the write to b in bar is not yet visible to the load of b (into rb) in foo, so the program prints 00.
// g++ -O2 -pthread axbx.cpp ; while [ true ]; do ./a.out | grep "00"; done prints 00 within 1min
#include<atomic>
#include<thread>
#include<cstdio>
using namespace std;
atomic<long> a,b;
long ra,rb;
void foo(){
    a.store(1,memory_order_relaxed);
    rb=b.load(memory_order_relaxed);
}
void bar(){
    b.store(1,memory_order_relaxed);
    ra=a.load(memory_order_relaxed);
}
int main(){
    thread t[2]{ thread(foo),thread(bar) };
    t[0].join(); t[1].join();
    if((ra==0) && (rb==0)) printf("00\n"); // each CPU's store-buffer write was not yet visible to the other thread
}
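For comparison, here is a sketch of the same two-variable test with a full fence between the store and the load in each thread, repeated in-process instead of through the shell loop. I am assuming that std::atomic_thread_fence(memory_order_seq_cst) is the C++ counterpart of the kernel's smp_mb(); the helper name count_00_with_fence and the trial count are mine and purely illustrative. With a sequentially consistent fence in both threads, the C++ memory model does not allow both loads to return 0, so the returned count should be 0.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (not part of the original test): runs the
// two-variable test `trials` times with a full fence between each
// store and load, and returns how many runs ended with ra==0 && rb==0.
int count_00_with_fence(int trials) {
    int hits = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<long> a{0}, b{0};
        long ra = -1, rb = -1;
        std::thread t1([&] {
            a.store(1, std::memory_order_relaxed);
            // Assumed C++ counterpart of smp_mb(): a seq_cst fence.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            rb = b.load(std::memory_order_relaxed);
        });
        std::thread t2([&] {
            b.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            ra = a.load(std::memory_order_relaxed);
        });
        t1.join();
        t2.join();
        if (ra == 0 && rb == 0) ++hits;  // the "00" outcome
    }
    return hits;
}
```

Calling count_00_with_fence(100000) from main and printing the result gives a quick check; removing the two fence lines turns it back into the relaxed test above.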
The code below is almost the same as the one above, except that the variable b is removed: both foo and bar store to and load from the same variable a, and the loaded values are stored in ra1 and ra2. In this case I never get "00", at least not after running for 5 minutes.
- In the second case, why doesn't it print 00? Why aren't the writes to a held in each CPU's store buffer for both threads, so that it prints 00?
- Is this specific to x86_64, and would it print 00 on arm/arm64/power?
- If arm/arm64/power does print 00, will an smp_mb() after the store in foo and bar fix it?
// g++ -O2 -pthread axbx.cpp ; while [ true ]; do ./a.out | grep "00"; done doesn't print 00 within 5 min
#include<atomic>
#include<thread>
#include<cstdio>
using namespace std;
atomic<long> a;
long ra1,ra2;
void foo(){
    a.store(1,memory_order_relaxed);
    ra1=a.load(memory_order_relaxed);
}
void bar(){
    a.store(1,memory_order_relaxed);
    ra2=a.load(memory_order_relaxed);
}
int main(){
    thread t[2]{ thread(foo),thread(bar) };
    t[0].join(); t[1].join();
    if((ra1==0) && (ra2==0)) printf("00\n"); // each CPU's store-buffer write was not yet visible to the other thread
}
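To rule out the shell loop and process start-up as the reason I never see 00, here is a sketch that repeats the single-variable test many times in-process. The helper name count_00_same_var and the trial count are my own, not part of the test above. My reading of the C++ coherence rules is that a load sequenced after a store to the same atomic in the same thread cannot return a value older than that store, which would make this count 0 on every architecture, but I would like that confirmed.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (not part of the original test): runs the
// single-variable test `trials` times and returns how many runs
// ended with ra1==0 && ra2==0.
int count_00_same_var(int trials) {
    int hits = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<long> a{0};
        long ra1 = -1, ra2 = -1;
        std::thread t1([&] {
            a.store(1, std::memory_order_relaxed);
            // Same thread, same variable: this load follows the store above.
            ra1 = a.load(std::memory_order_relaxed);
        });
        std::thread t2([&] {
            a.store(1, std::memory_order_relaxed);
            ra2 = a.load(std::memory_order_relaxed);
        });
        t1.join();
        t2.join();
        if (ra1 == 0 && ra2 == 0) ++hits;  // the "00" outcome
    }
    return hits;
}
```

Calling count_00_same_var(100000) from main and printing the result is enough to compare against the shell-loop numbers.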