Ordering of atomics with `memory_order_seq_cst`

Question

My reading of the C11 spec with regard to atomic operation ordering suggests that `memory_order_seq_cst` applies to operations on a specific atomic object.

Mostly, the descriptions are of the form "If A and B are applied to M, then the order is maintained on M".

My question is specifically what happens if we have two operations that apply to different atomic objects. Something like the following:

atomic_store(&a, 20);
atomic_store(&b, 30);

where a and b are atomic (volatile) types (and atomic_store implies memory_order_seq_cst).

This problem is relevant to a memory-mapped situation where the memory map represents the registers of some peripheral.

It's perfectly normal to have requirements about the ordering of the writes. Let's say a = 20 is setting up the target for our missile peripheral and setting b = 30 is the launch command. Clearly, we don't want to launch until the missile is targeted properly.

If it makes a difference to anything, this is on ARM Linux with GCC.

If `a` and `b` are declared volatile, then [C11 5.1.2.3p6](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#32) says, effectively, that a C compiler must ensure the modifications occur in that order (if done in a single thread). It does not matter whether `atomic_store()` does it atomically or not; it is enough that the compiler knows it (may) modify the value the specified pointer points to (so the first parameter cannot be a pointer to a const value). — Nominal Animal, Mar 29 '17 at 11:48
It's answers like [this one](http://stackoverflow.com/a/1787503/709852) that are guiding my queries on this issue. I'm not sure I know much in general about what the CPU is or isn't doing. I suppose the question is a general one about enforcing memory access constraints from pure C (without needing to invoke architecture specific code). — Henry Gomersall, Mar 29 '17 at 12:04
No, [`volatile` does not imply a memory fence](http://stackoverflow.com/questions/26307071/does-the-c-volatile-keyword-introduce-a-memory-fence). So it cannot be used for inter-thread communication. — Sander De Dycker, Mar 29 '17 at 12:04
So @SanderDeDycker without an explicit memory fence of some description, the only guarantee I have on the ordering is based on the non caching of certain memory regions. Is this correct? — Henry Gomersall, Mar 29 '17 at 12:25
[This](https://lwn.net/Articles/586838/) is a very interesting read on the topic. — Henry Gomersall, Mar 29 '17 at 12:29
@BoPersson Without an explicit memory fence, can I say anything about the processor not doing out-of-order execution of the stores (thereby breaking the ordering requirement)? It seems to me that `volatile` puts no hard restriction on the out-of-order execution engine of the CPU (through e.g. a memory fence). — Henry Gomersall, Mar 29 '17 at 12:37
If you're talking specifically about [`volatile`](http://en.cppreference.com/w/c/language/volatile), then that only provides guarantees within a single thread, and only with regards to (preventing) eliding and re-ordering of accesses to a `volatile` relative to other visible side effects. If you're talking about [`memory_order_seq_cst`](http://en.cppreference.com/w/c/atomic/memory_order), then that guarantees a single total order of modifications across all threads. They're really two very different things. — Sander De Dycker, Mar 29 '17 at 12:52
I'm assuming your other thread does something like `if( b == 30 ){ destroy(a) }`? — 2501, Mar 29 '17 at 13:42
@BoPersson The problem isn't the compiler, but the processor. It is perfectly plausible that any reads or writes will happen out of order, which could cause problems. This is why a memory fence is needed. — Henry Gomersall, Mar 29 '17 at 15:17
@BoPersson Well, regardless of the spec, there is no memory barrier introduced in the assembly generated from a write to a volatile, so I suggest the compiler is not handling anything related to out-of-order execution. [This page](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0321a/BIHGIEDG.html) describes why a memory barrier is necessary (as applicable in my case). — Henry Gomersall, Mar 29 '17 at 17:20

score 2 · Answer 1 · answered Mar 29 '17 at 19:20

Two memory accesses in the same thread are always sequenced, if they happen, and if there is a sequence point between them.

The "if they happen" part is guaranteed here if the two objects are declared volatile. This forces the compiler to actually emit the load or store to memory. How the compiler does this, and what guarantees it gives, is completely implementation-dependent. Read your platform documentation for that.
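As a rough illustration of the two volatile stores from the question, here is a minimal sketch. The names `fake_target_reg`, `fake_launch_reg`, `target_then_launch` and `demo` are hypothetical, and ordinary static variables stand in for real peripheral registers so that the code runs anywhere. The explicit `atomic_thread_fence` is an assumption, not something this answer prescribes: the C standard makes no promise about fences ordering non-atomic volatile accesses to device memory, so whether a barrier is needed, and which one, depends on the platform and on how the memory is mapped.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for two peripheral registers. In real code these
 * would be addresses inside a memory-mapped region, not static variables. */
static volatile uint32_t fake_target_reg;
static volatile uint32_t fake_launch_reg;

/* Writes the target first, then the launch command. */
void target_then_launch(volatile uint32_t *target_reg,
                        volatile uint32_t *launch_reg)
{
    /* volatile store: the compiler must emit it and may not move it
     * past the other volatile access below */
    *target_reg = 20;

    /* Assumption: ask the implementation for a full fence here. What
     * instruction (if any) this becomes, and whether it is sufficient
     * for a given peripheral, is platform-specific. */
    atomic_thread_fence(memory_order_seq_cst);

    *launch_reg = 30;
}

/* Runs the sequence on the stand-in registers and returns their sum. */
uint32_t demo(void)
{
    target_then_launch(&fake_target_reg, &fake_launch_reg);
    return fake_target_reg + fake_launch_reg;
}
```

Comparing the assembly generated with and without the fence is one way to see what a particular compiler and target actually do.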

The sequencing of statements has not much to do with volatile or atomics. It is implied by syntax. A good rule of thumb is that there is a sequence point at each `;`, `,`, `{`, `}`, `?`, `||` and `&&`. (There are more than that, but things become complicated if you want to reason with them.)
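The rule of thumb can be checked with a small sketch. The helper `step` and the functions around it are hypothetical names introduced only for this illustration. One caveat: the `,` in the rule means the comma operator, since the commas that separate function arguments are not sequence points.

```c
#include <stdbool.h>

/* Hypothetical trace buffer recording the order in which operands run. */
static int trace[4];
static int trace_len;

/* Records its id in the trace and returns the given result. */
static bool step(int id, bool result)
{
    trace[trace_len++] = id;
    return result;
}

/* `&&` is a sequence point and short-circuits: the left operand is fully
 * evaluated first, and the right one is skipped when the left is false.
 * Returns how many operands were evaluated. */
int and_short_circuit(void)
{
    trace_len = 0;
    (void)(step(1, false) && step(2, true));
    return trace_len;
}

/* The comma operator sequences its left operand before its right one,
 * so the assignment to i is complete before i + 1 is computed. */
int comma_order(void)
{
    int i = 0;
    int r = (i = 5, i + 1);
    return r;
}

/* Each `;` ends a full expression, so the two calls run in source order.
 * Returns the two recorded ids packed as a two-digit number. */
int statement_order(void)
{
    trace_len = 0;
    step(1, true);
    step(2, true);
    return trace[0] * 10 + trace[1];
}
```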

None of that is about atomics. Atomics exist to guarantee indivisibility of operations and data consistency between threads and with signal handlers. The big deal here is to have provable visibility of the side effects of operations. This is relatively involved, but it doesn't help you at all when you want to discuss things that happen in the same thread. On the contrary, the "happens before" relation between threads relies on the "sequenced before" relation within the individual threads.
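To make the last point concrete, here is a minimal two-thread sketch, assuming POSIX threads are available (as on Linux with GCC). The names `target`, `launch`, `producer`, `consumer` and `run_message_passing` are hypothetical. The plain write to `target` is sequenced before the atomic store to `launch` within the producer. Once the consumer's atomic load observes that store, the write to `target` happens before the consumer's read of it. The default `atomic_store` and `atomic_load` use `memory_order_seq_cst`, which includes the release and acquire semantics this relies on.

```c
#include <pthread.h>
#include <stdatomic.h>

static int target;          /* plain, non-atomic payload */
static atomic_int launch;   /* atomic flag that publishes the payload */

static void *producer(void *arg)
{
    (void)arg;
    target = 20;                 /* sequenced before the store below */
    atomic_store(&launch, 30);   /* seq_cst store, acts as a release */
    return NULL;
}

static void *consumer(void *arg)
{
    int *seen = arg;
    while (atomic_load(&launch) != 30)   /* seq_cst load, acts as an acquire */
        ;                                /* spin until the flag is visible */
    *seen = target;   /* the write of 20 happens before this read */
    return NULL;
}

/* Runs both threads once; returns the value of target seen by the
 * consumer, or -1 if a thread could not be created. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = -1;

    target = 0;
    atomic_store(&launch, 0);

    if (pthread_create(&c, NULL, consumer, &seen) != 0)
        return -1;
    if (pthread_create(&p, NULL, producer, NULL) != 0) {
        atomic_store(&launch, 30);   /* release the spinning consumer */
        pthread_join(c, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Depending on the C library version, building this may need the `-pthread` option. Note that this guarantee is about what another thread of the same program observes; it says nothing by itself about how a peripheral sees the writes.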

Jens Gustedt

To be clear, you mean sequenced in the assembly output? Not necessarily assembled so the sequence is obeyed by the processor? – Henry Gomersall Mar 30 '17 at 06:55
Just further to the last comment, surely to have well ordered side effects is to imply some kind of memory barrier? At the very least, there can be no side effects from subsequent memory accesses until the first have happened. – Henry Gomersall Mar 30 '17 at 06:59
@HenryGomersall, all rules in C are "as if". That is, all implementations have to behave such that all observations are consistent with that model. If your architecture needs a barrier to avoid reordering by the processor *and* such a reordering could be observed by a C program, your compiler has to provide such a barrier. – Jens Gustedt Mar 30 '17 at 07:04
The problem is, this isn't consistent with what is actually seen in the output assembly, in the context of [this discussion](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0321a/BIHGIEDG.html). I would expect, say, some code that writes a volatile register and then reads a different volatile register to need in the general case a memory fence when on an ARM that supports out-of-order execution. However, this is not what is seen in the GCC output. – Henry Gomersall Mar 30 '17 at 08:42
@JensGustedt : it's a bit more complicated than that. The standard says that "_accesses to `volatile` objects are evaluated strictly according to the rules of the abstract machine_" (5.1.2.3), which makes these accesses part of the observable behavior, and hence subject to the as-if rule. But it also says that "_what constitutes an access to an object that has `volatile`-qualified type is implementation-defined_" (6.7.3). So, it's all really up to the implementation to define what `volatile` means exactly, so you need to refer to the implementation's documentation. – Sander De Dycker Mar 31 '17 at 09:46