
As I understand it, C volatile, optionally combined with inline asm for memory fences, has been used to implement device drivers on top of memory-mapped I/O. Several examples can be found in the Linux kernel.

If we forget about the risk of uncaught exceptions (if any), does it make sense to replace them with C++11 atomics? Or is it possible at all?
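
For concreteness, here is a minimal sketch of the two patterns I am comparing. The register address and function names are made up for illustration, and the inline asm uses GCC/Clang syntax; note also that nothing in the standard guarantees that `std::atomic<std::uint32_t>` has the same layout as the raw register.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical device register address, made up for illustration.
constexpr std::uintptr_t STATUS_REG = 0xFFFF0000u;

// Traditional pattern: volatile access plus an explicit compiler barrier.
std::uint32_t read_status_volatile() {
    volatile std::uint32_t* reg =
        reinterpret_cast<volatile std::uint32_t*>(STATUS_REG);
    std::uint32_t value = *reg;      // the compiler must emit this load
    asm volatile("" ::: "memory");   // GCC/Clang compiler barrier
    return value;
}

// The replacement I am asking about.
std::uint32_t read_status_atomic() {
    auto* reg = reinterpret_cast<std::atomic<std::uint32_t>*>(STATUS_REG);
    return reg->load(std::memory_order_seq_cst);
}
```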

nodakai
  • What do exceptions have to do with it? – MikeMB Feb 24 '16 at 07:15
  • Some people, including Mr. Linus Torvalds, see exceptions in C++ as one of its most critical defects, especially for writing kernel code. I just wanted to clarify that such a debate is out of the scope of my question. – nodakai Feb 24 '16 at 07:43
  • Yes, but what have atomics to do with exceptions? – MikeMB Feb 24 '16 at 08:13
  • We might have to make an engineering decision to give up using `std::atomic` for memory mapped I/O, which is typically used in kernel space, if (1) at least one of its methods (indirectly) throws exceptions and (2) we don't have a good way to manage it in (a particular piece of) the kernel code. We have to be especially careful with it on a monolithic kernel like Linux. Anyway, it is out of the scope of my question. – nodakai Feb 24 '16 at 08:38
  • I see. To the best of my knowledge, all methods have the `noexcept(true)` specification, so they are not allowed to let any exceptions escape. In theory that doesn't prevent them from using exceptions internally, but I can't think of a reason why an actual implementation would want to do this (of course you would have to check the documentation / code of your standard library implementation to be sure). – MikeMB Feb 24 '16 at 09:00
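
A quick way to check that claim against a particular standard library is a compile-time probe along these lines (just a sketch; the `noexcept` operator simply reports what the implementation declares):

```cpp
#include <atomic>
#include <utility>

// Compile-time probe: both operands are unevaluated, so this only asks
// whether the declarations carry noexcept, as the comment above claims.
static_assert(noexcept(std::declval<std::atomic<int>&>().load()),
              "std::atomic<int>::load should be noexcept");
static_assert(noexcept(std::declval<std::atomic<int>&>().store(0)),
              "std::atomic<int>::store should be noexcept");
```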

2 Answers


As I understand from reading the references, std::atomic is designed to manage multi-threaded access to memory (concurrency and so on). But as far as I know, and as you said, volatile is designed for things like memory-mapped I/O and signal handling. So volatile alone has no effect on atomicity and does not resolve multi-threaded access issues the way atomics do. And vice versa - atomics do not provide features of volatile.
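
For example (a minimal sketch): with two threads incrementing both counters, the volatile one may lose updates, because the increment is a non-atomic read-modify-write and the program has a data race, while the atomic one always ends up at exactly 200000.

```cpp
#include <atomic>
#include <iostream>
#include <thread>

volatile int volatile_counter = 0;     // volatile alone: still a data race
std::atomic<int> atomic_counter{0};    // atomic: safe concurrent increments

int main() {
    auto work = [] {
        for (int i = 0; i < 100000; ++i) {
            volatile_counter = volatile_counter + 1;                  // racy
            atomic_counter.fetch_add(1, std::memory_order_relaxed);   // atomic
        }
    };
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    std::cout << "volatile: " << volatile_counter
              << "  atomic: " << atomic_counter.load() << '\n';
}
```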

Thus, the short answer to your question is NO.

VolAnd
  • "And vice versa - atomics do not provide features of volatile." You didn't provide any reasons backing up this statement. Also, please recall that, when we talk about `std::atomic` or `volatile`, 99.99 % of their usages are limited to CPU's native integers which are naturally atomic (if you have any non-toy counterexamples, I'd be interested in it.) – nodakai Feb 24 '16 at 07:45
  • @nodakai - `std::atomic` is not only about partial read/write of a variable, but also about cache coherency in multi-CPU systems. Your `volatile` memory mapped I/O is likely in a non-cached memory area, so totally different. – Bo Persson Feb 24 '16 at 09:13
  • @BoPersson Does MMIO not being cached *invalidate* any assumptions of `std::atomic`? Are you talking about inefficiency? How about using `std::memory_order_relaxed` then? – nodakai Feb 24 '16 at 09:59
  • @nodakai Also take a look at this post: http://stackoverflow.com/questions/8819095/concurrency-atomic-and-volatile-in-c11-memory-model – VolAnd Feb 24 '16 at 11:54
  • @VolAnd It's easy to post a link to a "related" Q&A when the website is equipped with a search function, but it doesn't immediately add value to your answer. What property of `std::atomic` makes it unusable for MMIO? Obviously issuing a memory barrier (which might accompany `std::atomic`) won't corrupt it in any way. (@MikeMB and @marko already raised a point.) You can claim [all the non-trivial zeros of the Riemann zeta function lie on the critical line](http://mathworld.wolfram.com/RiemannZetaFunctionZeros.html) and that might be true. But it must be backed up by valid reasoning. – nodakai Feb 24 '16 at 12:17
  • @nodakai Of course, a link doesn't add value to the answer by itself. That's why I wrote it in the comments (not in the answer). – VolAnd Feb 24 '16 at 12:40

In general, you can replace memory fences with atomics, but not volatile, except where volatile is used together with a fence exclusively for inter-thread communication.
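
For example, a hand-rolled publish that pairs a volatile flag with a fence purely for inter-thread communication can become a plain atomic store. This is only a sketch: the names are made up, the asm uses GCC/Clang syntax, and the legacy variant assumes a strongly ordered CPU such as x86, where a compiler barrier is enough.

```cpp
#include <atomic>

int payload;

// Legacy pattern: volatile flag plus an explicit barrier, used only for
// inter-thread publication (assumes x86-like ordering).
int* volatile ready_legacy = nullptr;

void publish_legacy(int value) {
    payload = value;
    asm volatile("" ::: "memory");   // keep the compiler from reordering
    ready_legacy = &payload;         // consumer polls this volatile pointer
}

// Replacement: the release store provides the ordering, so neither the
// volatile qualifier nor the hand-written fence is needed any more.
std::atomic<int*> ready_atomic{nullptr};

void publish_atomic(int value) {
    payload = value;
    ready_atomic.store(&payload, std::memory_order_release);
}
```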

With regard to memory-mapped I/O, the reason atomics don't suffice is that:

  • volatile guarantees that all memory accesses to that variable in your program do actually happen, and that they happen (within a single thread) exactly in the order you specify.
  • std::atomic only guarantees that your program will behave as if all those memory accesses happen (according to C++'s memory model, which doesn't know about memory mapped I/O) and - depending on the specified memory ordering - as if they happen in the specified order.

In practical terms that means the compiler can, e.g., replace consecutive writes to the same (non-volatile) atomic with a single write (if there is no other synchronization in between), and the same is true for reads. If the result of a read is not used, it could even eliminate the read completely (the compiler might still have to issue a memory barrier, though).
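
A sketch of what the as-if rule allows here (whether a given compiler actually performs the coalescing is another matter; the variable names are made up):

```cpp
#include <atomic>

std::atomic<int> shared_state{0};
volatile int device_reg = 0;   // stands in for a memory-mapped register

void atomic_stores() {
    // Under the as-if rule the compiler may collapse these into a single
    // store of 3: no synchronization in between can observe 1 or 2.
    shared_state.store(1, std::memory_order_relaxed);
    shared_state.store(2, std::memory_order_relaxed);
    shared_state.store(3, std::memory_order_relaxed);
}

void volatile_stores() {
    // Each volatile access is an observable side effect, so all three
    // stores must be emitted, in exactly this order.
    device_reg = 1;
    device_reg = 2;
    device_reg = 3;
}
```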

On a more theoretical level: if your compiler can prove that all your program does is return 42, then it is allowed to transform the whole program into a single instruction, regardless of how many threads and atomics it uses in the process. If your program uses volatile variables, that is not the case.

EDIT: For example, this paper shows a few possible (and probably unexpected) optimizations the compiler is allowed to apply to an atomic loop variable.

MikeMB
  • What property of `volatile` is missing from `std::atomic`? – nodakai Feb 24 '16 at 07:48
  • Making no assumptions about the memory having conventional load/store semantics - and not having other side effects. In many memory-mapped I/O devices, reads and writes have side effects (for instance, reading from a register de-asserts an interrupt line, or resets other registers). – marko Feb 24 '16 at 09:16
  • @MikeMB @marko I see, so while it makes sense to have a `void`-returning function whose body is a single `volatile` read from MMIO, when we replace it with `std::atomic::load`, even with `std::memory_order_seq_cst`, it can theoretically result in a genuine NOP - is this correct? – nodakai Feb 24 '16 at 09:54
  • @nodakai: Almost. It will still issue a memory fence (on x86 a compiler fence might be enough), but yes, the actual load could be optimized away (see the sketch after this thread). – MikeMB Feb 24 '16 at 09:59
  • @MikeMB Points taken, barriers, if any, won't be optimized away – nodakai Feb 24 '16 at 10:25
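
To make that exchange concrete, here is a sketch of the two variants being discussed; the register address and function names are made up for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical "acknowledge interrupt by reading" register address.
constexpr std::uintptr_t IRQ_ACK_REG = 0xFFFF0040u;

// volatile: the load must be emitted even though the value is discarded,
// which is exactly what a read-to-acknowledge register relies on.
void ack_irq_volatile() {
    volatile std::uint32_t* reg =
        reinterpret_cast<volatile std::uint32_t*>(IRQ_ACK_REG);
    std::uint32_t discard = *reg;   // read purely for its hardware side effect
    (void)discard;
}

// atomic: as discussed above, the ordering (fence) must be preserved, but
// the load itself could in theory be optimized away, losing the side effect.
void ack_irq_atomic() {
    auto* reg = reinterpret_cast<std::atomic<std::uint32_t>*>(IRQ_ACK_REG);
    (void)reg->load(std::memory_order_seq_cst);
}
```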