How to force an unused memory read in C that won't be optimized away?

Question

Microcontrollers often require a register to be read to clear certain status conditions. Is there a portable way in C to ensure that a read is not optimized away if the data is not used? Is it sufficient that the pointer to the memory-mapped register is declared as volatile? In other words, would the following always work on standard-compliant compilers?

void func(void)
{
   volatile unsigned int *REGISTER = (volatile unsigned int *) 0x12345678;

   *REGISTER;
}

I understand that dealing with functionality like this runs into compiler-dependent issues. So, my definition of portable is a bit loose in this case. I just mean that it would work as widely as possible with the most popular toolchains.

Inline assembly is not usually optimized. Unless you want it crossplatform, it's fine. — Dmytro Sirenko, Dec 11 '12 at 16:06
@EarlGray That's a very valid technique. However, and this is nitpicky, I have an example where that would require an extra instruction because the compiler could otherwise optimize out an address load by using a relative read from an already loaded address. Of course, I did specify portability as the goal and not performance... — Judge Maygarden, Dec 11 '12 at 16:16
@Judge: FYI, there's a further concern about the difference between accessing a volatile *object*, and accessing an object using a volatile *lvalue expression* (http://stackoverflow.com/questions/13268657). That question is about C++, but the same nitpickery might well apply to C too. — Steve Jessop, Dec 11 '12 at 16:56
I call a function written in assembly, `GET32(addr)`, whose body is just `ldr r0,[r0]; bx lr`, and never have problems. It is more portable than inline assembly, slightly slower of course, but it has other advantages. – old_timer, Dec 11 '12 at 22:52

zwol · Accepted Answer · 2012-12-12T16:48:53.207

10

People argue quite strenuously about exactly what volatile means. I think most people agree that the construct you show was intended to do what you want, but there is no general agreement that the language in the C standard actually guarantees it as of C99. (The situation may have been improved in C2011; I haven't read that yet.)

A nonstandard alternative, fairly widely supported by embedded compilers, that may be more likely to work is

void func(void)
{
  asm volatile ("" : : "r" (*(unsigned int *)0x12345678));
}

(The 'volatile' here applies to the 'asm' and means 'this may not be deleted even though it has no output operands'. It is not necessary to put it on the pointer as well.)
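As a rough sketch of how this construct might be packaged for reuse, the empty asm can be wrapped in a macro that accepts any pointer expression. The names `FORCE_READ` and `clear_status` are hypothetical, and the code assumes a compiler that accepts GCC-style extended asm (GCC, Clang, and toolchains that emulate them); it is not standard C.

```c
#include <stdint.h>

/* Hypothetical macro name. Relies on GCC-style extended asm, which is a
 * compiler extension. The "r" input operand makes the compiler load
 * *(ptr) into a register; the asm body itself emits no instruction. */
#define FORCE_READ(ptr) \
    __asm__ __volatile__ ("" : : "r" (*(ptr)))

/* Hypothetical helper: read a status register only for its side effect.
 * The pointee is volatile-qualified here too; the answer says this is
 * not required for the asm form, but it does no harm. */
static void clear_status(volatile uint32_t *status_reg)
{
    FORCE_READ(status_reg);
}
```

On real hardware the argument would be the register's address; on a host machine an ordinary variable can stand in for the register when experimenting.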

The major remaining drawback of this construct is that you still have no guarantee that the compiler will generate a one-instruction memory read. With C2011, using _Atomic unsigned int might be sufficient, but in the absence of that feature, you pretty much have to write a real (nonempty) assembly insert yourself if you need that guarantee.

EDIT: Another wrinkle occurred to me this morning. If reading from the memory location has the side-effect of changing the value at that memory location, you need

void func(void)
{
  unsigned int *ptr = (unsigned int *)0x12345678;
  asm volatile ("" : "=m" (*ptr) : "r" (*ptr));
}

to prevent mis-optimization of other reads from that location. (To be 100% clear, this change will not change the assembly language generated for func itself, but may affect optimization of surrounding code, particularly if func is inlined.)

edited Dec 12 '12 at 16:48

answered Dec 11 '12 at 16:21

zwol


I'm not familiar with that construct. What would it look like if the address was in a pointer variable instead of a constant? – Judge Maygarden Dec 11 '12 at 16:24
@JudgeMaygarden `... : : "r" (*ptr)`. The parentheses are mandatory, but what's inside is just a regular old C expression. – zwol Dec 11 '12 at 16:29
Thanks. I just figured out that I was missing the parentheses around the pointer. – Judge Maygarden Dec 11 '12 at 16:29
That works. It also allows the address load from the pointer to be optimized out. I wouldn't have thought that the most reliable and portable version would use inline assembler. Setting the read parameter without an actual instruction is a neat trick! – Judge Maygarden Dec 11 '12 at 16:33
I wrapped this in a macro and it works the same way as my example using IAR EWARM. I'm considering switching to GCC in the future and want to shore up loose ends. I'll still need to test both methods in GCC. – Judge Maygarden Dec 11 '12 at 16:44
@Zack The wording regarding volatile objects was improved in C11 5.1.2.3/6. See my answer for a quote from the standard. – Lundin Dec 12 '12 at 15:27

Lundin · Answer 2 · 2012-12-12T15:29:09.753

5

Yes, the C standard guarantees that code accessing a volatile variable will not be optimized away.

C11 5.1.2.3/2

"Accessing a volatile object, " ... "are all side effects"

C11 5.1.2.3/4

"An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)."

C11 5.1.2.3/6

"The least requirements on a conforming implementation are:

— Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine."
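Under the reading of the standard quoted above, a read-and-discard can be written in plain C. The sketch below is a minimal illustration, not a guarantee: the helper name `discard_read` is hypothetical, and it assumes the implementation treats evaluating a volatile lvalue in a void context as an access (what counts as an access is implementation-defined, as the comments in this thread point out).

```c
#include <stdint.h>

/* Hypothetical helper: read a register and throw the value away.
 * The cast to void only silences "value computed is not used" style
 * warnings; the volatile qualifier on the pointee is what asks the
 * compiler to keep the access. */
static void discard_read(volatile uint32_t *reg)
{
    (void)*reg;
}
```

Checking the generated assembly for the target toolchain is still the only way to be sure that the load is emitted, and that it is emitted with the expected width.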

edited Dec 12 '12 at 15:29

answered Dec 12 '12 at 15:11

Lundin


This language was all in C99 as far as I recall. Does C2011 address the specific point of what qualifies as an "access"? That was implementation-defined in C99 and that was the point of dispute which rodrigo, Steve Jessop, and I reprised in the comments on rodrigo's answer. – zwol Dec 12 '12 at 16:45
@Zack No, the 3rd quote was not in C99, I checked before posting this. I haven't quoted the part which says that what counts as an access is impl.defined. – Lundin Dec 12 '12 at 18:46
My copy of C99 is at home and I'm not, but I am *certain* that the sentence "Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine" appeared somewhere in C99. – zwol Dec 12 '12 at 19:08
@Zack Ah yeah you are right, I found it at C99 6.7.3/6. `Therefore any expression referring to such an [volatile] object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3`. It seems they merely moved that part up to 5.1.2.3 in C11. – Lundin Dec 12 '12 at 19:20

rodrigo · Answer 3 · 2012-12-11T17:27:30.410

2

IIRC, the C standard is a bit loose in the definition of use, so the *REGISTER is not necessarily interpreted as doing a read.

But the following should do:

int x = *REGISTER;

That is, the result of the memory reference has to be used somewhere. The x does not need to be volatile, however.

UPDATE: To avoid the "unused variable" warning, you could use a no-op function. A static and/or inline function should be optimized away without runtime penalty:

static /*inline*/ void no_op(int x)
{ }

no_op(*REGISTER);

UPDATE 2: I've just come up with a nicer function:

static unsigned int read(volatile unsigned int *addr)
{
    return *addr;
}

read(REGISTER);

Now, this function can be used both for read-and-use and for read-and-discard. 8-)
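A small sketch of both uses side by side may help. The names `reg_read32` and `status_and_clear` are hypothetical; the reader function is renamed here only because a function called `read` can clash with the POSIX `read()` declaration on hosted systems.

```c
#include <stdint.h>

/* Hypothetical name, chosen to avoid clashing with POSIX read(). */
static inline uint32_t reg_read32(volatile uint32_t *addr)
{
    return *addr;   /* volatile load; the caller may ignore the result */
}

/* Hypothetical usage showing both patterns on the same register. */
static uint32_t status_and_clear(volatile uint32_t *status_reg)
{
    uint32_t status = reg_read32(status_reg);  /* read-and-use */
    (void)reg_read32(status_reg);              /* read-and-discard */
    return status;
}
```

The `(void)` cast on the discarded call is optional; it only documents the intent and quiets warnings about an ignored return value.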

edited Dec 11 '12 at 17:27

answered Dec 11 '12 at 16:01

rodrigo


If `x` itself is not used, this may not be enough. – zwol Dec 11 '12 at 16:02
This would throw a warning for an unused variable. I don't allow warnings in released code, and that is a good one for catching bugs. This would require disabling that warning in functions that use this technique. – Judge Maygarden Dec 11 '12 at 16:04
I hate to say it, but ... a sufficiently aggressive compiler will optimize out your `no_op` function *and then optimize out the load to the now-unused `x`*. – zwol Dec 11 '12 at 16:10
@Zack: I think it is enough, the read is done, and cannot be optimized away because it is volatile. The fact that the l-value written to is not volatile should have no consequence. The problem without the assignment is that the standard does not specify that the volatile value is actually read at all. – rodrigo Dec 11 '12 at 16:12
@rodrigo It seems to me that any argument that applies to `*vol;` should apply to `x = *vol;` when `x` is unused. You may be right, though, it's been a very long time since I read the relevant part of the standard. – zwol Dec 11 '12 at 16:16
@Zack: Even if the compiler aggressively optimizes the write, it cannot optimize the read, because it is volatile! The problem is that nowhere do the C specs say that `*x` means `read the value at x`; it just means an lvalue with the `x` address. It is obvious if you write `*x = 0`, it should not read `*x` before writing into it. – rodrigo Dec 11 '12 at 16:21
@rodrigo My disagreement with you is a little different: I disbelieve that the abstract machine being required to perform a "read" is conditional on the *textual presence* of an operator that consumes the value of the read. – zwol Dec 11 '12 at 16:21
Again, I could be wrong, it has been a very long time, but what I recall is that there's only necessarily a "read" if there is a *not provably dead* "write" to go with it. – zwol Dec 11 '12 at 16:27
@rodrigo: 6.8.3/2 of C99 says that in the statement `*x;`, the expression `*x` is evaluated for its side-effects. A volatile read is a side-effect (5.1.2.3/2), so I suppose the question is whether "evaluating" a full-expression means to determine its value (in which case it must do the read) or in the case of an lvalue whether it merely means to determine what object it designates (in which case it doesn't). I realise this is more or less a restatement of what you've already said, but it supports your claim that `i = *x;` is different from `*x;`. – Steve Jessop Dec 11 '12 at 16:36
@Zack: There is no such thing as dead writes in C. There are _observable_ side effects and not observable ones. In `x = *REGISTER` both read and write happen in the abstract machine specified by the language. The read is observable (because of `volatile`) while the write is not. So the write can be optimized away, while the read cannot: that is sometimes called the _as-if_ rule. In `*REGISTER` however it is not clear that there is a read in the abstract machine. It may be interpreted as a read-and-discard operation or as a no-op. – rodrigo Dec 11 '12 at 16:40
@SteveJessop: Let's put it in another context: `*x = 0`, now obviously `*x` is evaluated as part of the full expression, and its side effects are done, but it should not be read. Some people argue that the memory read is done as part of the l-value to r-value conversion in `a = *x`. But the C standard doesn't say so, it is just an interpretation. Moreover, I just can't remember whether a void expression (`*x;`) is an l-value or an r-value... – rodrigo Dec 11 '12 at 16:43
@rodrigo: that may be part of the problem, that there's no such thing as an rvalue in C99. There is "the value of an expression", and there is "evaluating expressions" and on the left of an assignment evaluating the sub-expression clearly *doesn't* compute its value. With `a = *x` the value of the sub-expression `*x` clearly is calculated (and hence the read occurs as a side-effect), since `=` assigns it to `a`. Like you say, what matters is whether (to use C++ terminology) there is an lvalue-rvalue conversion at "top level" in the statement `*x;`. – Steve Jessop Dec 11 '12 at 16:46
Great conversation! I obviously need to think about this a bit more. After rereading the passage from the standard that Jens posted (and deleted) in light of Steve's comments, I'm not sure why the `*REGISTER;` method without assignment isn't perfectly valid... – Judge Maygarden Dec 11 '12 at 16:58
@JudgeMaygarden: Let me cite the [GCC Extensions documentation](http://gcc.gnu.org/onlinedocs/gcc/Volatiles.html): "The standard encourages compilers to refrain from optimizations concerning accesses to volatile objects, but leaves it implementation defined as to __what constitutes a volatile access__" (emphasis mine). – rodrigo Dec 11 '12 at 17:05
I just sent a link to this question to another firmware engineer at the office, and the "empty asm" approach confused the hell out of them. So, that's one strike against it. ;) – Judge Maygarden Dec 11 '12 at 17:10
@rodrigo The access is implementation-defined as per the C standard 6.7.3/7. However, in real life, an access to a volatile object is always either a read or a write. I suppose some weird compiler could define the access as "just a write", but I don't think any such compiler exists, because it would render the volatile keyword useless. – Lundin Dec 12 '12 at 15:21
@Lundin There really have been compilers that optimized out `*x;`, usually as a side-effect of some other optimization that was actively desired. And I have personally witnessed quite epic flame wars over whether or not this counted as a bug. – zwol Dec 12 '12 at 16:52
@Zack Like in [this paper](http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf)? Interesting reading, nonetheless. – Lundin Dec 12 '12 at 18:49
@Lundin Yeah, only about five years prior. – zwol Dec 12 '12 at 19:07

score 1 · Answer 4 · answered Nov 05 '14 at 20:47

GNU C-specific extensions are perhaps not considered very portable, but here is another alternative.

#define read1(x)  \
({ \
  /* the pointer's target must be volatile, or the load may be dropped */ \
  volatile __typeof(x) * _addr = (volatile __typeof(x) *) &(x); \
  *_addr; \
})

This translates to the following assembler line (compiled with gcc for x86 and optimized with -O2): movl SOME_REGISTER(%rip), %eax

I get the same assembler from:

#include <stdint.h>  /* uint32_t */

static inline uint32_t read2(volatile uint32_t *addr)
{
   return *addr;
}

... as suggested in another answer, but read1() will handle different register sizes. Even though I'm not sure whether using read2() with 8- or 16-bit registers would ever be an issue, there are at least no warnings about the parameter type.
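To illustrate the point about register sizes, here is a sketch of a type-following read macro applied to 8-, 16- and 32-bit objects. The name `READ_REG` is hypothetical, and the code relies on two GNU C extensions (statement expressions and `__typeof__`), so it assumes GCC, Clang, or a compatible compiler.

```c
#include <stdint.h>

/* Hypothetical macro name. Uses two GNU C extensions: statement
 * expressions ({ ... }) and __typeof__. The pointer's target type is
 * volatile-qualified, so the load goes through a volatile lvalue whose
 * width follows the type of x. */
#define READ_REG(x) \
({ \
    volatile __typeof__(x) *addr_ = (volatile __typeof__(x) *) &(x); \
    *addr_; \
})
```

With this form, `READ_REG(r8)` on a `uint8_t` object performs a byte-sized read and `READ_REG(r32)` on a `uint32_t` object a word-sized one, without a separate function for each width.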

score 0 · Answer 5 · answered Dec 11 '12 at 16:22

Compilers usually do not optimize assembly inlines (it's hard to analyze them properly). Moreover, it seems to be a proper solution: you want more explicit control over the registers and it's natural for assembly.

Since you're programming a microcontroller, I assume that there is some assembly already in your code, so a bit of inline assembly won't be a problem.

answered Dec 11 '12 at 16:22

Dmytro Sirenko


The OP asked "Is there a portable way...". So inline assembler is completely out of the question. – Lundin Dec 12 '12 at 15:23
@Lundin The OP says that "...my definition of portable is a bit loose in this case. I just mean that it would work as widely as possible with the most popular toolchains". My way is at least portable across toolchains (and the answer that was accepted acknowledges my opinion) – Dmytro Sirenko Dec 12 '12 at 15:31
Since inline assembler isn't covered by the C standard (unlike in C++), it wouldn't be portable between tools either. There is `asm NOP;` and `__asm NOP;` and `asm NOP` and `asm ("NOP")` and `asm { NOP; }` and so on... – Lundin Dec 12 '12 at 15:36
