On a weakly ordered memory model such as ARM, how does the following code work?
Note: I'm aware that memory ordering is relevant only between threads running on multiple cores, and that the code below runs on a single core. I still have the questions below.
#include <atomic>
#include <cassert>
#include <csignal>

std::atomic_bool a {false};
std::atomic_bool b {false};

void signal_handler(int)
{
    if (b.load(std::memory_order_relaxed))
    {
        // Compiler-only fence: pairs with the release fence in main().
        std::atomic_signal_fence(std::memory_order_acquire);
        assert(a.load(std::memory_order_relaxed));
    }
}

int main()
{
    std::signal(SIGINT, &signal_handler);
    a.store(true, std::memory_order_relaxed);
    // Compiler-only fence: keeps the store to a before the store to b.
    std::atomic_signal_fence(std::memory_order_release);
    b.store(true, std::memory_order_relaxed);
}
std::atomic_signal_fence only guarantees ordering at the compiler level.
Question 1
This is what I understand: out-of-order execution can reorder a.store() and b.store(). That is, the store buffer could contain b == true while the store to a has not yet executed (still a == false). If the signal handler runs at this point, the assert would fail (the handler would read b from the store buffer via store forwarding). I'm aware this cannot happen on x86, since stores are not reordered with other stores, but I'm not sure how this works on ARM. What does the CPU do to make sure the signal handler sees the effects of the main code in program order?
Question 2
In a multi-threaded environment, if no memory barriers are used, can the store buffer be flushed to cache out of order? My assumption is that it can be, since otherwise no memory barriers would be needed. However, the definition of out-of-order execution that I read says "OoOE processors fill these "slots" in time with other instructions that are ready, then re-order the results at the end to make it appear that the instructions were processed as normal.". I'm not sure whether the linked definition refers only to x86, which follows a strong memory model.
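For comparison, here is a minimal sketch of the multi-threaded counterpart that Question 2 has in mind. It is an illustration under stated assumptions, not part of the original program: the function names writer, reader and run_two_threads are hypothetical, and std::atomic_signal_fence is replaced by std::atomic_thread_fence, which the C++ standard defines as establishing ordering between threads rather than only between a thread and a signal handler running on that same thread.

```cpp
#include <atomic>
#include <thread>

// Same two flags as in the signal-handler example.
std::atomic_bool a {false};
std::atomic_bool b {false};

// Hypothetical writer thread: publishes a, then sets b.
void writer()
{
    a.store(true, std::memory_order_relaxed);
    // Inter-thread release fence: the store to a is ordered before
    // the store to b as observed by a thread that synchronizes on b.
    std::atomic_thread_fence(std::memory_order_release);
    b.store(true, std::memory_order_relaxed);
}

// Hypothetical reader thread: waits for b, then reads a.
bool reader()
{
    while (!b.load(std::memory_order_relaxed))
    {
        // Spin until the writer's store to b becomes visible.
    }
    // Inter-thread acquire fence: pairs with the release fence in writer().
    std::atomic_thread_fence(std::memory_order_acquire);
    return a.load(std::memory_order_relaxed);
}

// Runs both threads and returns the value of a that the reader observed.
bool run_two_threads()
{
    bool seen = false;
    std::thread t1(writer);
    std::thread t2([&seen] { seen = reader(); });
    t1.join();
    t2.join();
    return seen;
}
```

With the two thread fences in place, the C++ memory model guarantees that reader() returns true once it has observed b == true. My question is about what can happen at the hardware level when those fences are absent or, as in the signal-handler version, are compiler-only.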