
I'm asking this question with a Rust-flavoured snippet, but it targets both Rust and C++, since the memory-model concepts are essentially the same in both.

use std::sync::atomic::{AtomicU32, Ordering};

fn some_func(counter: &AtomicU32) {
    let x = counter.load(Ordering::Relaxed);    // #1
    counter.store(x + 1, Ordering::Relaxed);    // #2
    let _y = counter.load(Ordering::Relaxed);   // #3
}

Question: Imagine some_func is being executed by a thread, and between #2 and #3 the thread gets interrupted, so that #3 ends up executing on a different core. In that case, does the counter variable get synchronized with the last updated value (written on core 1) when the thread resumes on core 2, even though there is no explicit release/acquire? I suppose the entire cache line plus the thread-local storage gets shelved and reloaded when the thread briefly goes to sleep and comes back running on a different core?

Peter Cordes
Rohit Sharma
  • Preemption, context switching, and other such CPU mechanisms are transparent to C++, as long as you respect synchronization requirements. In the context of a single-threaded function call there are no synchronization requirements, and switching cores has no observable effect. – François Andrieux Feb 09 '23 at 16:11
  • This is all within a single thread, so there are no synchronization issues; it will just do what it obviously does. Even if `counter` is just a plain old `int`, not atomic. The CPU takes care of managing context for each thread. It's only when a variable is used by more than one thread that you have to worry about synchronization. – Pete Becker Feb 09 '23 at 16:56

1 Answer


First of all, it should be noted that atomic instructions add synchronization, and do not remove it.

Would you expect:

unsigned func(unsigned* counter) {
    auto x = *counter;    // #1: plain, non-atomic load
    *counter = x + 1;     // #2: plain store of the incremented value
    auto y = *counter;    // #3: plain re-load
    return y;
}

To return anything other than the original value of *counter plus 1?

Yet here, too, the thread could be moved to another core between any two of those statements!

The above code executes fine even when the thread is moved to another core, because the OS takes care during the switch to appropriately synchronize between cores and preserve user-space program order.
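
If you want to convince yourself, here is a rough sketch of an experiment, assuming a Linux machine with at least two online cores; it uses sched_setaffinity to force the thread onto a different core between the statements (the helper name is just for the example). The assertion holds whether or not the migration actually happens:

// Sketch only: Linux-specific; pins the calling thread to a given core.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <cassert>
#include <cstdio>

static void pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    sched_setaffinity(0, sizeof(set), &set);   // 0 == the calling thread
}

int main() {
    unsigned counter = 0;

    pin_to_cpu(0);               // run #1 and #2 on core 0
    unsigned x = counter;        // #1
    counter = x + 1;             // #2

    pin_to_cpu(1);               // force a migration before #3
    unsigned y = counter;        // #3, now running on core 1 (if core 1 exists)

    assert(y == x + 1);          // program order is preserved across the switch
    std::printf("y = %u\n", y);
}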

So, what happens when using atomics on a single thread?

Well, you add a bit of processing overhead -- more synchronization -- and the OS still takes care during the switch to appropriately synchronize.

Hence the effect is strictly the same.
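
For illustration, the relaxed-atomic equivalent of func might look something like this (the function name and types are just for the example); called from a single thread with no other writers, it is guaranteed to return the same x + 1, because a thread always observes its own earlier writes, even after migrating to another core:

#include <atomic>

unsigned func_atomic(std::atomic<unsigned>* counter) {
    auto x = counter->load(std::memory_order_relaxed);   // #1
    counter->store(x + 1, std::memory_order_relaxed);    // #2
    auto y = counter->load(std::memory_order_relaxed);   // #3
    return y;   // x + 1, as long as no other thread writes *counter concurrently
}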

Matthieu M.
  • *appropriately manage the caches.* It's not caches that are the problem, it's stuff like the private store buffer inside each core. [Cache is coherent between cores that a single OS is running on and can schedule threads to.](https://stackoverflow.com/a/58535118/224132) What the kernel actually needs is to "take care of synchronization", probably with acquire/release synchronization which it needed for its own kernel data anyway. (Or something stronger for machines like x86 that have special weakly-ordered instructions like `movntps` that aren't ordered by atomic release / acquire.) – Peter Cordes Feb 09 '23 at 17:35
  • So to avoid spreading the common misconception that memory reordering is due to caches, I'd recommend "the OS takes care ... to appropriately synchronize". A better mental model is local reordering of accesses to coherent shared cache, as in https://preshing.com/20120710/memory-barriers-are-like-source-control-operations/. That's why IRIW reordering is so rare; it requires [a microarchitecture that can make stores visible between (logical) cores before they commit to cache](https://stackoverflow.com/a/50679223/224132). I'll edit, you can of course re-edit to whatever you want to say. – Peter Cordes Feb 09 '23 at 17:37
  • @PeterCordes: That's a good point. I was specifically thinking of the store buffer as I wrote, and used "cache" as a generic term for reads/writes, without thinking that in the context of CPUs it would be understood as referring to L1/L2/L3. Thanks for the edit! – Matthieu M. Feb 09 '23 at 18:37