
GCC does not seem to be able to trace and optimize programs that read/write global variables in C/C++, even if they're static, which should allow it to guarantee that other compilation units won't change the variable.

When compiling the code

static int test = 0;

int abc() {
  test++;
  if (test > 100)
    return 123;
  --test;
  return 1;
}

int main() {
  return abc();
}

with the flags -Os (to produce shorter and more readable assembly) and either -fwhole-program or -flto, using GCC 11.2, I would expect this to be optimized to return 1, that is, to the following assembly:

main:
        mov     eax, 1
        ret

This is in fact what is produced if test is a local variable. However, the following is produced instead:

main:
        mov     eax, DWORD PTR test[rip]
        mov     r8d, 1
        inc     eax
        cmp     eax, 100
        jle     .L1
        mov     DWORD PTR test[rip], eax
        mov     r8d, 123
.L1:
        mov     eax, r8d
        ret

Example: https://godbolt.org/z/xzrPjanjd
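
For comparison, here is a minimal sketch of the local-variable version mentioned above. The name `abc_local` is only for illustration and is not part of my real code:

```c
/* Same logic as abc(), but the counter is a local variable. */
int abc_local(void) {
  int test = 0;      /* local instead of a file-scope static */
  test++;
  if (test > 100)
    return 123;      /* unreachable: test is always 1 at this point */
  --test;
  return 1;
}
```

Calling this from main in place of abc() is the case that compiles down to the two-instruction main shown earlier.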

This happens with both GCC and Clang, and every other compiler I tried. I would expect modern compilers to be able to trace the flow of the program and remove the check. Is there something I'm not considering that may allow something external to the program to affect the variable, or is this just not implemented in any compilers yet?

Related: Why gcc isn't optimizing the global variable? However, the answer given there mentions external functions and threads, neither of which applies here.

Heath Mitchell
  • There is a difference between a constant and a variable. `test` here is not a constant. Check with `if (100 > 100)` once. – जलजनक Apr 04 '22 at 19:12
  • Also, I _think_ that with `extern int test;` another compilation unit can still modify the variable. – Mooing Duck Apr 04 '22 at 19:13
  • @SparKot I know it's not a constant, but `return 123` is still unreachable, which I would think GCC would be able to know considering that the initial value of `test`, and all places where it could be written to, are known. It works with local variables. – Heath Mitchell Apr 04 '22 at 19:14
  • @MooingDuck I don't have `extern int test` though, unless that would somehow work from a different file? Either way I do have `-fwhole-program` on so it should be able to assume that it doesn't happen, shouldn't it? – Heath Mitchell Apr 04 '22 at 19:16
  • @SupportUkraine So is this a potential optimization that compilers just can't do at the moment? – Heath Mitchell Apr 04 '22 at 19:19
  • You can even create smaller examples of this. Even if the only statement in `main` is to modify the `test` variable, it is not optimized away: https://godbolt.org/z/ano76fe4T – Jakob Stark Apr 04 '22 at 19:26
  • @SupportUkraine What I'm trying to ask is this: Is there some kind of attribute I can set that will let the compiler optimize this, or would it need to be done manually? This is a reduced version of a real example using the `wasm2c` tool, where it will increment a counter before a function, check if it exceeds a value, then decrement it again after. You can see it here: https://github.com/WebAssembly/wabt/blob/main/wasm2c/examples/fac/fac.c#L9 This messes up some optimisations so I would like to automatically remove it – Heath Mitchell Apr 04 '22 at 19:31
  • If you make the function `static int abc(void)`, does anything change? – Jonathan Leffler Apr 04 '22 at 19:40
  • @JonathanLeffler No, that doesn't change anything :( – Heath Mitchell Apr 04 '22 at 19:42
  • OK. Obviously, making the function `static` tells the compiler that the only code using `abc()` is in this file, which opens the door to inline optimization. Have you tried `inline` too (when the function is also `static`)? It probably isn't going to alter things. Have you tried alternative optimization options (`-O3`, etc)? Are you sure you're not worrying about minuscule optimizations — in what bigger context would it make a measurable difference? – Jonathan Leffler Apr 04 '22 at 19:45
  • @MooingDuck I do hope not. – Paul Sanders Apr 04 '22 at 21:50
  • One can imagine a hack where you have an apparently unchanging `static` variable that controls some program behavior, but then you patch the executable after compilation to initialize it to some other value. There's all kinds of problems with this, but it could be that such code is out there and gcc wants to avoid breaking it. Just a wild guess. – Nate Eldredge Apr 05 '22 at 06:47
  • @NateEldredge Now that's _proper_ programming :) – Paul Sanders Apr 05 '22 at 07:41
  • @MooingDuck: No, `static int foo;` can't be accessed from another compilation unit. `extern int foo;` is how other CUs need to access an `int foo;` global in C (they can't all just declare `int foo;` except with `gcc -fcommon` the old default). Another CU could call into `abc()` (e.g. from another thread it started in an init function that ran before `main`). But I think any argument for that runs into UB, so this is just an obscure missed optimization. Most functions that modify a variable do *not* provably return it to its original value every time, so it's not something worth looking for. – Peter Cordes Apr 05 '22 at 07:56
  • "Are you sure you're not worrying about minuscule optimizations" - I probably am, but I'm also just curious as to why this happens – Heath Mitchell Apr 05 '22 at 08:32
  • @NateEldredge: GCC will optimize away static vars that really are only read; it's the modification of `test` here that stops GCC from noticing. (Because it doesn't bother trying to prove that modifications never actually change the value; presumably that's rare in real-world functions, and/or expensive to try to look for.) Interestingly, adding an `if (test == 0) return 1;` early-out with the initial value (or even a couple higher) lets clang optimize it away: https://godbolt.org/z/vnx6z7G53 – Peter Cordes Apr 05 '22 at 09:19

1 Answer


I think you are asking a bit too much of most compilers. While the as-if rule in the standard probably allows the compiler to optimize the static variable away, this is apparently not implemented in many compilers, as you observed for GCC and Clang.

Two reasons I could think of are:

  • In your example, the link-time optimization obviously decided to inline the abc function but did not optimize away the test variable. That would require an analysis of the read/write semantics of the test variable, which is very complex to do in a generic way. It might be possible in the simple case that you provided, but anything more complex would be really difficult.

  • The use case for such optimizations is rare. Global variables are most often used to represent some shared global state. It makes no sense to optimize that away. The effort for implementing such a feature in a compiler/linker would be large compared to the benefit for most programs.
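
To make the first point concrete: for the 123 branch to be removed, the compiler would have to prove that test is 0 on every entry to abc, which is an inductive argument over all calls rather than a purely local one. Below is a minimal sketch of stating that invariant by hand. It assumes a GCC- or Clang-compatible compiler that provides `__builtin_unreachable()`. I have not verified that this folds the function to return 1 on every compiler version, so check the generated assembly.

```c
static int test = 0;

int abc(void) {
  /* Assumption stated by hand: test is 0 on every entry.
     If this is ever false, the behavior is undefined. */
  if (test != 0)
    __builtin_unreachable();
  test++;
  if (test > 100)
    return 123;
  --test;
  return 1;
}
```

The hint is only safe as long as nothing else ever leaves test at a nonzero value, which is exactly the property the compiler does not try to prove on its own.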

Addition
Apparently GCC optimizes away the variable if you only ever read it. If you compile the following:

static int test = 0;

int abc() {
  int test_ = test;
  test_++;
  if (test_ > 100)
    return 123;
  --test_;
  return 1;
}

int main() {
  return abc();
}

Here the variable is read once into a local variable and never written to, and the code gets optimized away to:

main:
    mov     eax, 1
    ret

(See here for a demo)
However, using such a local variable would defeat the whole point of having a global variable. If you never write to it, you might as well define a constant.
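
As a rough sketch of that last remark, a value that is never written can simply be declared constant. The `const` qualifier and the rewritten comparison are my illustration, not part of your code:

```c
static const int test = 0;   /* never written, so declare it const */

int abc(void) {
  /* test + 1 stands in for the increment, which a const object does not allow */
  if (test + 1 > 100)
    return 123;
  return 1;
}
```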

Jakob Stark
  • _The optimization that you are asking for takes place during the linking step_ Are you sure about that? – Paul Sanders Apr 04 '22 at 21:52
  • There are other optimizations involving `static` that are done at compile time, not at link time. As the simplest example, a `static` function or variable that is never used will be deleted at compilation and never even make it to the linker. So if this optimization was going to be done, I would expect it to be done by the compiler, not the linker. – Nate Eldredge Apr 05 '22 at 06:42
  • @PaulSanders no I am not so sure now that you mentioned it. I will edit the answer – Jakob Stark Apr 05 '22 at 07:34
  • Fun fact: Adding an `if (test == 0) return 1;` early-out in `abc()` with the initial value (or even `test == 1` or 2 while still zero-initializing it) lets clang optimize it away: https://godbolt.org/z/vnx6z7G53 . So it is just a GCC missed-optimization (which makes sense not to look for especially without the early-out). Unless they want to support a library `.init` hook function starting multiple threads and using `PTRACE_SINGLESTEP` on each one to run the increment but not the decrement. Since that's still data-race UB, I don't think GCC is intentionally supporting that. – Peter Cordes Apr 05 '22 at 09:25
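
A minimal sketch of the early-out variant described in the comment above, assuming the check is placed at the top of abc(). The exact code behind the Godbolt link may differ:

```c
static int test = 0;

int abc(void) {
  if (test == 0)     /* early-out on the initial value */
    return 1;
  test++;
  if (test > 100)
    return 123;
  --test;
  return 1;
}
```

This is the form that the comment reports Clang can optimize away.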