How do changes (read/writes) to std::atomic variables propagate across threads

Question

I recently asked this question: do-i-need-to-use-memory-barriers-to-protect-a-shared-resource

To that question I got a very interesting answer that uses this hypothesis:

Changes to std::atomic variables are guaranteed to propagate across threads.

Why is this so? How is it done? How does this behavior fit within the MESI protocol?

By default, atomic operations are sequentially consistent. The exact way in which this is accomplished varies from platform to platform. — T.C., Jan 24 '15 at 02:16
The compiler will spit out the necessary instructions to make it so, because the standard says it needs to happen. If you want to know how a particular platform implements this, you better specify which one. The combination of [all atomic operations] x [all memory orderings] x [all C++11 compilers] x [all CPU architectures] is quite long. — DanielKO, Jan 24 '15 at 14:12

score 4 · Accepted Answer · answered Feb 20 '15 at 01:09

They don't actually have to propagate. The cache coherency model (MESI or something more advanced) guarantees that memory behaves coherently, almost as if it were flat and no cached copies existed. Sequential consistency adds to that a guarantee that all agents in the system observe the same order (note that most CPUs don't provide sequential consistency through HW alone).

If a thread does a memory write (not even atomic), the core it runs on will fetch the line and obtain ownership of it. Once the write is done, any thread that attempts to observe the line is guaranteed to see the updated value, even if the line still resides in the modifying core - usually this is achieved by snooping the core and getting the line from it as a response. The cache coherency protocol guarantees that if such a modification exists locally in some core, any other core looking for that line is bound to see it eventually. To do that, the CPU may use snoop filters, directory management (often for cross-socket coherency), or other methods.

Now, you're asking why atomic is important? For 2 reasons. First - all the above applies only if the variable resides in memory, and not in a register. This is a compiler decision, so the correct type tells it to do so. Other paradigms (like OpenMP or POSIX threads) have other ways to tell the compiler that a variable needs to be shared through memory. Second - modern cores execute operations out-of-order, and we don't want any other operation to pass that write and expose stale data. By default, std::atomic tells the compiler to enforce the strongest memory ordering (through the use of explicit fencing or locking - check out the generated assembly code), which means that all your atomic operations from all threads will have the same global ordering. If you didn't do that, strange things can happen, like core A and core B disagreeing on the order of 2 writes to different locations (coherency alone already keeps the order of writes to a single location consistent).
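
To make the two points above concrete, here is a minimal sketch (not taken from the original answer) of the usual "publish through a flag" pattern. The function name `publish_and_consume` and the value 42 are made up for illustration. It relies only on the default sequentially consistent `store`/`load` of `std::atomic<bool>`:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: hand a plain int from one thread to another
// through a std::atomic<bool> flag.
int publish_and_consume() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // default operations are seq_cst

    std::thread producer([&] {
        payload = 42;       // plain write
        ready.store(true);  // the write above may not be reordered after this store
    });

    int seen = -1;
    std::thread consumer([&] {
        // The flag is re-read from memory on every iteration; the compiler
        // may not keep it in a register.
        while (!ready.load()) {
            std::this_thread::yield();
        }
        seen = payload;     // happens after the producer's write to payload
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

Calling `publish_and_consume()` from `main` should return 42. If `ready` were a plain `bool`, this would be a data race: the compiler would be free to hoist the load out of the loop or reorder the two writes in the producer.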

Last, of course, is the actual atomicity - if your data type is not one that has atomicity guaranteed, or it's not properly aligned, std::atomic will also solve that problem for you (otherwise the coherency problem intensifies - think of some thread trying to change a value split between 2 cache lines, and different cores seeing partial values).
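
As a rough illustration of the "partial values" point, the sketch below (my own, with the hypothetical name `count_torn_reads`) has a writer that always stores a 64-bit value whose two 32-bit halves are equal, and a reader that checks the halves still match. With `std::atomic<std::uint64_t>` each load and store is indivisible, so the reader should never see a mix of two writes:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical demo: returns how many half-updated (torn) values the reader saw.
int count_torn_reads(std::uint32_t iterations) {
    std::atomic<std::uint64_t> value{0};
    std::atomic<bool> done{false};
    int torn = 0;  // written only by the reader thread, read after join()

    std::thread writer([&] {
        for (std::uint32_t i = 1; i <= iterations; ++i) {
            // Both 32-bit halves always carry the same number.
            value.store((static_cast<std::uint64_t>(i) << 32) | i);
        }
        done.store(true);
    });

    std::thread reader([&] {
        while (!done.load()) {
            std::uint64_t v = value.load();
            std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
            std::uint32_t lo = static_cast<std::uint32_t>(v);
            if (hi != lo) {
                ++torn;  // would mean the load mixed two different stores
            }
        }
    });

    writer.join();
    reader.join();
    assert(torn == 0);  // each atomic load returns a value from a single store
    return torn;
}
```

With a plain, misaligned 64-bit variable there is no such guarantee (and the program would have a data race), which is the situation the paragraph above describes.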

I understood almost everything in your answer except for the ordering bit. If we're talking about 1 atomic variable, can you give me an example where ordering matters? For multiple shared variables ordering is a must and fences must be used, but for 1 atomic I can't see it — Kam, Feb 20 '15 at 02:40
@kam - Yes, ordering is relevant when you have more than one variable. But atomicity is by itself enough reason to use std::atomic (think what happens if 2 threads do `x++` at the same time - std::atomic makes this an atomic RMW) — Leeor, Feb 20 '15 at 09:52
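
The `x++` case from the comment above can be sketched roughly as follows. The function name `count_with_atomic` is hypothetical, and this is only an illustration of the atomic RMW behavior, not code from the thread:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: two threads each increment a shared counter n times.
int count_with_atomic(int n) {
    std::atomic<int> x{0};

    auto work = [&] {
        for (int i = 0; i < n; ++i) {
            x++;  // atomic read-modify-write, equivalent to x.fetch_add(1)
        }
    };

    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();

    assert(x.load() == 2 * n);  // no increment is lost
    return x.load();
}
```

With `std::atomic<int>` the result is always `2 * n`. With a plain `int` the two threads would race on the load-add-store sequence, updates could be lost, and the behavior would be undefined.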
