From the Procedure Call Standard for the ARM architecture (§7.1.5):

a compiler may ignore a volatile qualification of an automatic variable whose address is never taken unless the function calls setjmp().

Does that mean that in the following code:

volatile int x = 8;
if (x == 1)
{
    printf("can be optimised away??");
}

The whole `if` block can be optimised out?
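For anyone who wants to inspect the generated assembly, here is a minimal self-contained sketch of the snippet above. The function name `local_volatile_branch_taken` and its return value are assumptions added for illustration and are not part of the original code; the return value only makes the outcome of the comparison visible to a caller.

```c
#include <stdio.h>

/* Hypothetical wrapper (name not from the question): returns 1 if the
   branch was taken, 0 otherwise. */
int local_volatile_branch_taken(void)
{
    volatile int x = 8;   /* automatic variable whose address is never taken */

    if (x == 1)           /* this read of x is a volatile access in C terms */
    {
        printf("can be optimised away??");
        return 1;
    }
    return 0;
}
```

Since nothing here changes `x`, a call returns 0. The interesting part is whether the load of `x` and the comparison still appear in the compiler output at higher optimisation levels.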

This seems to contradict the C standard. For starters, volatile accesses are part of the observable behaviour and must be performed as in the abstract machine:

§5.1.2.3:

The least requirements on a conforming implementation are:

Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.

And also §6.7.3:

An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine

Is there a contradiction? And if so, how is it legitimate for a PCS to contradict the C standard?

izac89
  • Why is it contradicting the standard? If no one is observing `x` through a pointer, then why does it still need to be treated as `volatile`? The C standard says "At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred." This looks like it can be met given the constraints set by the PCS. – John Szakmeister Sep 14 '19 at 09:17
  • `The least requirements on a conforming implementation are: Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine...` Can it get simpler than that? Also: `An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine,` – izac89 Sep 14 '19 at 09:20
  • How could `x` be modified in a way unknown to the compiler? – John Szakmeister Sep 14 '19 at 09:25
  • FWIW, clang and gcc will optimize it away on an Intel platform too. – John Szakmeister Sep 14 '19 at 09:27
  • I cannot find any compiler on godbolt that optimizes it away. Examples: https://godbolt.org/z/DlA4vs , https://godbolt.org/z/yUU6O9 – izac89 Sep 14 '19 at 09:30
  • Hrmph... you're right... I must have missed it when looking at the disassembly--shame on me. :-( Either way, the assumption here is that you don't really know where `x` is being allocated, so without some way of obtaining a pointer to it, no one outside the routine can modify it. I do believe it's a safe assumption to make. Hopefully one of the compiler/standards gurus on SO can jump in and say more. – John Szakmeister Sep 14 '19 at 09:40
  • There is a similar question to yours here: https://stackoverflow.com/questions/51472394/is-it-allowed-for-a-compiler-to-optimize-away-a-local-volatile-variable Both answers are present. :-) – John Szakmeister Sep 14 '19 at 09:46
  • This is not a dup at all. Here there is a contradiction between the ARM APCS and the C standard, and this question is about C. The question you linked is about C++ (hence a different standard; there is no such thing as a glvalue in C) and different code. In any case, both the provided assembly and some answers there say it cannot be optimized, hence my question is still unanswered. – izac89 Sep 14 '19 at 09:48
  • I wasn't calling it a dupe--I mentioned it as a point of interest, not as a potential answer. I would have voted to close the question if I thought that. And there is an answer claiming the other direction: https://stackoverflow.com/a/51488739/683080. Honestly, I think it best to ask the open source compiler development groups. The LLVM folks are good guys and could shed more light on it. I doubt many of those folks hang out on SO, so asking on one of their lists would be a better way to get an expert answer, IMHO. – John Szakmeister Sep 14 '19 at 09:56
  • "An actual implementation need __not evaluate__ part of an expression if it can deduce that its value is not used and that no needed side effects are produced (__including__ any caused by calling a function or __accessing a volatile object__)" [C11 5.1.2.3p4](https://port70.net/~nsz/c/c11/n1570.html#5.1.2.3p4) – KamilCuk Sep 14 '19 at 10:57
  • `How could x be modified in a way unknown to the compiler?` <- by an ISR or memory-mapped hardware etc. The compiler is not aware of how the HW works. This happens a lot in embedded software. – Morten Jensen Sep 14 '19 at 10:57
  • @KamilCuk This is not the case. A case that falls under your citation is something like `void func(void) { int x; x = *(volatile int*)ADDRESS; return; }`. Here the side effect of the assignment that follows the volatile access is not used, hence the assignment need not be evaluated. Yet the volatile access itself still MUST happen. (Example: https://godbolt.org/z/gE7RPf) – izac89 Sep 14 '19 at 11:01
  • In your example `ADDRESS` is an address, not a volatile object. Right? I think the "volatile object" is a variable that is explicitly declared with "volatile". Or not? No no, that the current implementation does not adhere to the standard doesn't mean anything. Also the rule is "_may_ ignore", not "must ignore". I don't believe I know an implementation that will optimize `if (x == 1)` away. – KamilCuk Sep 14 '19 at 11:04
  • So? It's still an example that meets your citation, but the code in the question does not. – izac89 Sep 14 '19 at 11:05
  • Take a look at this: https://godbolt.org/z/cEObu3. The code: `void func() { volatile int x; int y; y = x; }`. The access to `x` must be performed. The `y` side effect can be removed. – izac89 Sep 14 '19 at 11:06
  • @KamilCuk Also take a look at https://stackoverflow.com/questions/55534512/optimization-allowed-on-volatile-objects and please note the difference between the code in this question and the code example in the answer in that thread (whose condition can never be true). – izac89 Sep 14 '19 at 11:10
  • `the access to x must be performed` - From your example, it's undefined to access `x`, it's uninitialized. Anyway, with initialization `x = 0`, it makes no real difference in generated assembly. – KamilCuk Sep 14 '19 at 11:15
  • Yes, obviously I forgot to initialize it. – izac89 Sep 14 '19 at 11:17
  • @KamilCuk: If `x` is volatile, the compiler cannot know it has not been changed by some means, so the fact it is not initialized explicitly by the C code does not mean it has not been assigned a value, so reading it is not undefined behavior. Also “access” in the C standard means reading or writing, so “accessing” an uninitialized object is not necessarily undefined behavior even for non-volatile objects. – Eric Postpischil Sep 14 '19 at 11:30
  • @JohnSzakmeister: it seems that [neither gcc nor clang will optimize this away](https://godbolt.org/z/VyMIrb). I guess you could also theoretically have a higher priority ISR which writes over the stack, i.e. the memory area which holds automatic variables, so this is one of those weird low level scenarios which C is supposed to handle. – vgru Sep 14 '19 at 11:53
  • I think the key is in "Accesses to volatile objects are evaluated strictly according to the rules of the _abstract machine_." The _ARM procedure call standard_ applies not to any _abstract machine_, just to _ARM abstract machines_, and so can optimize out the code as it _knows_ `x` is not accessed within its world of _abstract machines_. – chux - Reinstate Monica Sep 14 '19 at 13:05
  • @EricPostpischil I was following [6.3.2.1p2](https://port70.net/~nsz/c/c11/n1570.html#6.3.2.1p2): `If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.` (the behavior is conversion to lvalue). It doesn't mention volatile. `x` here is `auto` and doesn't have its address taken, so I assumed taking its lvalue has to be undefined. – KamilCuk Sep 14 '19 at 13:09
  • @KamilCuk: Since the object is volatile, an assignment to it may have been performed prior to use, without the compiler’s knowledge. – Eric Postpischil Sep 14 '19 at 13:16
  • @MortenJensen Sorry, the PCS specifically stated this is an automatic variable--which means it's function scope and not a global variable. The memory space containing it is on the stack not backed by a special device or anything like that. You'd have to be poking into things you really shouldn't to do this and I would say that is architecturally a bad idea--so bad that a compiler shouldn't have to account for that kind of shenanigans. Note: it also says if you take the address of the variable, then you can't make this optimization. – John Szakmeister Sep 14 '19 at 14:03
  • @Groo Yep, I backed away from what I said earlier about gcc and clang... I misread the disassembly. As for the other bits, I think that's a stretch at best and fundamentally broken idea at worst. ARM does define quite a bit in terms of how things should operate, from top to bottom across a variety of documents. This document isn't in a vacuum, it's assuming that you're following the other conventions set forth. Again, I think a better forum for this would be one of the compiler mailing lists--where the compiler experts hang out. – John Szakmeister Sep 14 '19 at 14:12
  • @chux the standard defines the abstract machine like this `In the abstract machine, all expressions are evaluated as specified by the semantics`. So, when the standard says volatiles should be evaluated strictly according to the abstract machine, it means volatiles should be evaluated strictly `as specified by the semantic`, regardless of which machine runs it (ARM, Intel, etc.) – izac89 Sep 17 '19 at 05:24

1 Answer

So I reached out to the ARM toolchain's support group. According to them, the ARM PCS is an independent standard that is not bound to the C standard, so a compiler can choose to comply with one or both of them. In their own words:

In a way it's not really a contradiction

  • the APCS permits a compiler to respect or ignore local volatile
  • the C standard requires a compiler to respect local volatile

so a compiler that is compatible with both will respect local volatile.

Armclang has elected to follow the C standard which makes it compatible with both

So if a compiler chooses to perform this optimization, which does not conform to the C standard, it is still an ARM PCS-conforming implementation, but it is not a C-conforming compiler.

To conclude, a C-conforming compiler for the ARM architecture that implements the ARM PCS will never perform this optimization.
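As a side note on the `setjmp()` carve-out in the PCS wording: the C standard itself (C11 §7.13.2.1) says that non-volatile automatic objects modified between `setjmp` and `longjmp` have indeterminate values after the jump, whereas volatile-qualified ones keep their last stored value. That is presumably why even the PCS does not allow the qualifier to be ignored in that case, though this connection is my reading and not something ARM support stated. A minimal sketch, where `value_after_longjmp`, `env` and `counter` are hypothetical names chosen for illustration:

```c
#include <setjmp.h>

static jmp_buf env;   /* hypothetical jump buffer for this sketch */

/* Hypothetical helper: modifies a local between setjmp and longjmp
   and returns its value after the jump. */
int value_after_longjmp(void)
{
    volatile int counter = 0;   /* volatile, so the store below survives the jump */

    if (setjmp(env) == 0) {
        counter = 42;           /* modified after setjmp, before longjmp */
        longjmp(env, 1);        /* control resumes at setjmp, which now returns 1 */
    }
    return counter;             /* well-defined only because counter is volatile */
}
```

With the `volatile` qualifier in place, the function returns 42. Dropping the qualifier would make the value of `counter` after the jump indeterminate under the C standard.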

izac89
  • Then what guarantees are provided for volatile qualified automatic objects? – curiousguy Oct 19 '19 at 22:04
  • The ARM ABI promises nothing. If the compiler is C-standard compliant, it must follow the C standard's guarantees. The ABI permits ignoring the volatile qualifier on automatic variables, which a compiler that does not conform to the C standard can do. – izac89 Oct 20 '19 at 04:43