
While writing an answer regarding how compilers must treat volatile, I believe I may have stumbled upon a gcc bug and would like someone to verify before I report it.

I wrote up a simple function such as this:

int foo (int a, int b, int c)
{
  b = a + 1;
  c = b + 1;
  a = c + 1;
  return a;
}

Without optimizations this results in a lot of pointless moving of data back and forth. With optimizations the compiler just takes the register where a was stored, adds 3 and returns the result. In x86 terms: lea eax, [rdi+3] followed by ret. This is expected; so far so good.

To demonstrate sequencing and volatile access, I changed the example to this:

int foo (int a, int b, int c)
{
  b = a + 1;
  c = *(volatile int*)&b + 1;
  a = c + 1;
  return a;
}

Here there's an lvalue access of the contents of b that is volatile qualified and, as far as I can tell, the compiler is absolutely not allowed to optimize away that access (footnote 1). From gcc 4.1.2 (and probably earlier) to gcc 10.3 I get conforming behavior (same in clang). The x86 machine code looks like this even with -O3:

foo:
        add     edi, 1
        mov     DWORD PTR [rsp-4], edi
        mov     eax, DWORD PTR [rsp-4]
        add     eax, 2
        ret

When I try the same on gcc 11.1 and later, I get:

foo:
        lea     eax, [rdi+3]
        ret

https://godbolt.org/z/e5x74z3Kb

ARM gcc 11.1 does something similar.

Is this a compiler bug?


1) References: ISO/IEC 9899:2018 5.1.2.3, particularly §2, §4 and §6.

Lundin
  • I think the pointer is considered volatile and its value is kept out of optimisations, but the memory it points to is not. – sorush-r Dec 16 '21 at 14:32
  • @sorush-r It doesn't really matter. I'm telling the compiler "you must read this variable from memory here" and it doesn't. Suppose I have some reason for it, like for example dummy reading a variable on the heap to ensure that the heap allocation is carried out _now_ and not later on when I use it for the first time. There are many ways that a volatile access side effect can affect the program. – Lundin Dec 16 '21 at 14:38
  • I also tried this `uintptr_t x = (uintptr_t)&b; c = *(volatile int*)x + 1;` but it gets optimized away too. – Lundin Dec 16 '21 at 14:42
  • @sorush-r: No, it's a pointer to `volatile int`. What you're describing would be `*(int *volatile)&b` and indeed lets the access optimize away even with older GCC like 9.4 that don't have the bug(?) described in this question. https://godbolt.org/z/bs31xveYK (the volatile-qualified pointer object result of the cast is never materialized anywhere, which is fine since it's only an rvalue) – Peter Cordes Dec 16 '21 at 14:42
  • @sorush-r `volatile int*` is a pointer *to* volatile data. – Eugene Sh. Dec 16 '21 at 14:42
  • Looks like a compiler bug, similar to [this](https://stackoverflow.com/questions/55457835/arm-compiler-5-do-not-fully-respect-volatile-qualifier). In both cases it looks like the compiler feels free to assume automatic variables cannot be "volatile" (which is quite true, except in the case of debugged programs, where the variables can be changed under the runtime's feet). – Eugene Sh. Dec 16 '21 at 14:44
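
The distinction drawn in the comments above, between a pointer *to* volatile data and a volatile pointer, can be sketched roughly as follows. The function names are made up for illustration and do not appear in the question. Both functions return the same value; the only difference is which object carries the qualifier.

```c
/* Hypothetical helper: reads through a pointer TO volatile int.
 * The qualifier applies to the pointed-to int, so the read of *pv
 * is a volatile-qualified lvalue access. */
int read_via_pointer_to_volatile(int *p)
{
    volatile int *pv = p;
    return *pv;
}

/* Hypothetical helper: here the POINTER OBJECT vp is volatile.
 * The int it points to is accessed as a plain int, so the
 * qualifier says nothing about the access to *vp. */
int read_via_volatile_pointer(int *p)
{
    int *volatile vp = p;
    return *vp;
}
```

Called with the address of an `int` holding 5, both helpers return 5. Whether a given compiler keeps the load in the first one is exactly what the question is about.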

3 Answers


Per C18 5.1.2.3/6, accesses to volatile objects (strictly according to the rules of the abstract machine) are part of the observable behavior of the program, which all conforming implementations must reproduce. The term "access" in this context includes both reads and writes.

C18 5.1.2.3/2 and /4 reinforce that volatile accesses are needed side effects, excluded from the rule that implementations are allowed to avoid producing unneeded side effects.

The only out I see for GCC would be an argument that although (volatile int*)&b is an lvalue with volatile-qualified type, it can prove that the object it designates (b) is not actually a "volatile object", which indeed it is not if you go by its declaration. And that is consistent with GCC 11.2's observed behavior for this version of the function:

int foo (int a, int b, int c)
{
  volatile int bv = a + 1;
  c = bv + 1;
  a = c + 1;
  return a;
}

This yields the same assembly as older versions of GCC produce for the original code (godbolt).

Whether this constitutes a bug in the sense of non-conformance with the language standard is unclear, but certainly GCC is thwarting the apparent intent of the programmer.

John Bollinger
  • Revisiting this old question. Since the time of this answer [DR476](https://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_476) has been implemented in C23. Meaning that "The only out I see for GCC" is no longer valid either, as soon as C23 goes live. I ended up reporting this as a gcc bug and apparently the "volatile lvalue access" over "access of volatile object" feature was already supposed to be live in gcc since many years back. So there was potentially some hiccup during the gcc 11 release. – Lundin Jan 05 '23 at 12:23
  • Thanks, @Lundin, that DR indeed speaks directly to the point. I'm glad to hear that GCC already accepts the observed behavior as erroneous, especially given that acceptance of the DR into C23 leaves no room for doubt. – John Bollinger Jan 05 '23 at 14:42
  • Well I only submitted it as a bug today, but at least one person agreed that it is a confirmed bug. Either way they'll have to implement the DR476 "volatile semantics for lvalues" for the C23 release later this year. – Lundin Jan 05 '23 at 15:02

Passing the address to a non-inline function makes GCC respect volatile casts for later accesses (and maybe earlier, didn't check) to a function arg or local. https://godbolt.org/z/cssveev7n

I duplicated the c = line and the asm contains two loads of b thanks to the volatile cast, using GCC trunk.

void bar(void*);
int foo (int a, int b, int c)
{
  bar(&b);              // b's address has now "escaped" - potentially globally visible
  b = a + 1;

  c = *(volatile int*)&b + 1;
  c = *(volatile int*)&b + 1;   // both accesses present.
  a = c + 1;
  return a;
}

# GCC trunk -O3 -fverbose-asm
        call    bar     #
        mov     DWORD PTR [rsp+12], ebx   # b, tmp89
        mov     eax, DWORD PTR [rsp+12]   # _2, MEM[(volatile int *)&b]
        mov     eax, DWORD PTR [rsp+12]   # _3, MEM[(volatile int *)&b]
 ... 
        add     eax, 2
        ret

So this seems harmless except maybe in some microbenchmark use-cases; it's not going to break hand-rolled atomics using casts like these, such as the Linux kernel's READ_ONCE / WRITE_ONCE macros.

It is still arguably a violation of ISO C rules, if it's legal to alias a plain int with a volatile int. If not, the behaviour is defined only by GCC, so it's up to GCC. I post this more as a data point than as an argument in either direction on that aspect of the question.
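
For readers unfamiliar with the pattern, here is a minimal sketch of a hand-rolled "read once / write once" built from volatile casts. The macros below are hypothetical, int-only simplifications written for this illustration; they are not the Linux kernel's actual definitions, and they give no atomicity or memory-ordering guarantees. `shared_flag`, `publish_flag` and `poll_flag` are likewise invented names.

```c
/* Hypothetical, int-only simplifications; NOT the kernel's real macros.
 * Each one accesses a plain int through a volatile-qualified lvalue. */
#define READ_ONCE_INT(x)      (*(volatile int *)&(x))
#define WRITE_ONCE_INT(x, v)  (*(volatile int *)&(x) = (v))

/* External linkage: the object is reachable from other translation
 * units, so the compiler cannot treat it as private to one function. */
int shared_flag;

/* Store the value through a volatile-qualified lvalue. */
void publish_flag(int value)
{
    WRITE_ONCE_INT(shared_flag, value);
}

/* Load the value through a volatile-qualified lvalue. */
int poll_flag(void)
{
    return READ_ONCE_INT(shared_flag);
}
```

After `publish_flag(42)`, a call to `poll_flag()` returns 42. The point of the sketch is only that the target here is not a known-private automatic variable, which is the case where the answer above observed GCC still honouring the cast.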

Peter Cordes
  • The `bar` call will prevent the optimization where it is just grabbing (the register containing) `a`, adding 3 and returning the result, so it has to respect the volatile access simply because it can no longer reason about the contents of `b`. Still, in your example you do get the extra expected `mov` instruction, which doesn't happen if modified to `c = *(int*)&b + 1;`. – Lundin Dec 16 '21 at 15:04
  • @Lundin: Right, the no-cast code is more complicated, too, but my first thought was "this bug will break hand-rolled atomics", and it turns out it doesn't. In any (or at least this) case where the value is potentially visible to other threads, `volatile` is respected. I'm curious if it was an intentional GCC change to disregard `volatile` on accesses to known-private automatic storage. – Peter Cordes Dec 16 '21 at 15:08
  • @Lundin: It is theoretically problematic for microbenchmarks that use a volatile *cast* (instead of a volatile var like `volatile int c`) or inline asm to force compilers to materialize a value being assigned. It seems it applies to writes as well as reads: https://godbolt.org/z/sK5bofWeq shows `*(volatile int*)&c = ...` twice not stopping full optimization into registers, with the function call commented out. But things are fine with `volatile int c` like most low-effort microbenchmarks use, if they don't use inline asm, so it's probably fine and good in case of READ_ONCE on locals. – Peter Cordes Dec 16 '21 at 15:13
  • In embedded systems you often do volatile reads like this to enforce a "dummy access", although in such scenarios the target variable is almost always a `volatile` qualified one. – Lundin Dec 16 '21 at 15:31

I heard a compiler team argue convincingly (ok, I nearly fell asleep, so I only got a rough outline) that outside of an externally scoped, word-sized object, volatile was a meaningless decoration. Further, the compiler provided some sort of traditional behaviour surrounding meaninglessly attributed objects as a convenience to people working with legacy code. This interpretation was based on an absurd reduction of the C standard which is better than correct: it is technically correct, the gold standard of alpha-geeks.

mevets
  • Then how can you explain this: https://godbolt.org/z/bKfTqdGzr. I made one of the local variables `volatile` and now even gcc 11 makes an attempt to respect that qualifier, while still performing an optimization `add edi, 2`. – Lundin Dec 16 '21 at 15:38
  • I said nothing prescriptive; I merely obliquely parroted the opinion of the developers of one of the top 10 compiler providers. Standards interpretation should be by prescription only; and you should certainly wait hours before operating heavy machinery. – mevets Dec 16 '21 at 15:46
  • *volatile was a meaningless decoration* Given that `volatile` turns all accesses of the variable by the code as-written into observable side effects (by definition), a compiler team of all groups characterizing it as a "meaningless decoration" is more than a little scary. I have no idea how hand-waving away required side effects could be "technically correct". – Andrew Henle Dec 16 '21 at 15:53
  • In the OP's example, the function argument is not observable. Since it is not observable there is no contract. An automatic definition is only observable if its address has been recorded. Note that `y = *&x` doesn't record x's address; but `*&y = x` might. The nice thing about standards is that there are so many interpretations of them. – mevets Dec 17 '21 at 23:36