Why does -O3 in gcc seem to initialize my local variable to 0, while -O0 does not?

Question

Following the question "What happens to a declared, uninitialized variable in C? Does it have a value?", I tried playing with Ciro Santilli's code, shown below.

#include <cassert>

int f() {
    int i = 13;
    return i;
}

int g() {
    int i;
    return i;
}

int main() {
  assert(f() == 13);
  assert(g() == 0);
}

The call to g() should reuse the same stack frame address, so the value of i in g(), although not initialized, should be 13. I have checked the address of i in both functions and it is the same.

However, it is 13 only when I use g++ -O0, not with other levels such as -O1 or -O3.

I tested this on both Windows 10 (gcc 8.1.0) and Ubuntu 18.04 (gcc 7.5.0). To be precise:

g++ -O3 -o test test.cpp -std=c++11

produces no assertion failure,

while g++ -O0 -o test test.cpp -std=c++11 gives Assertion `g() == 0' failed.

I understand that using i in g() is undefined behavior according to the standard, but it seems strange to me that the other optimization levels seem to go out of their way to change the value of i in g() from 13 back to 0. Am I missing something?

https://godbolt.org/z/bz3zd8Tbo With `-O3` it optimizes out the assert calls altogether. So you can't say that `g()` returns 0. In fact the function is optimized down to just a single `ret` instruction, so the return value is whatever happens to be in the `eax` register at the time. This is one thing that can happen when you hit undefined behavior — Kevin, Sep 01 '21 at 18:36
*Am I missing something???* You are missing that the C++ standard describes the behavior of a **C++ abstract machine**, and the compiler's job is to turn the code into something that honors the C++ abstract machine behavior, with the latitude to take advantage of the *as-if rule* and the "here be dragons" *undefined behavior* with its wickedly clever optimizer. The upshot is that C++ gives you enough rope to shoot yourself in the foot. As is done in the code here. — Eljay, Sep 01 '21 at 18:41
This is a good example of why it is usually an error to try to anticipate the result of Undefined Behavior. You can't reason about C++ code that contains Undefined Behavior, and you can't validate against Undefined Behavior at runtime. Ub can potentially break any test you put in place to detect it, such as the `assert` you used here. — François Andrieux, Sep 01 '21 at 18:42
I recommend the talk [Michael Spencer “My Little Optimizer: Undefined Behavior is Magic"](https://youtu.be/g7entxbQOCc). It's a bit detailed and specific to one compiler, but it gives definitions for what Undefined Behaviour is and means, and shows very concretely the strange things that compilers do to your code when UB is involved. — alter_igel, Sep 01 '21 at 18:42

Kevin · Accepted Answer · 2021-09-01T18:46:42.343

As eerorika's answer says, your code invokes undefined behavior.

If you actually look at the generated assembly code, you get this:

f():
        mov     eax, 13
        ret
g():
        ret
main:
        xor     eax, eax
        ret

As you can see, g() is a single ret instruction, whereas f() sets eax to 13. So the return value of g() is whatever happens to be in the eax register at the time.

The reason why you think that g() returns 0 is that the assert doesn't fail. But that's because -O3 optimized out the assert calls altogether and essentially replaced the body of main with return 0;.

Edit: That was actually clang's output. With gcc, g() is compiled to this:

g():
        xor     eax, eax
        ret

so in that case it actually is return 0; :). I don't know why gcc does this, but you can't really reason about it since, as mentioned before, it is undefined behavior.

eerorika · Answer 2 · 2021-09-01T18:40:26.477

2

A default-initialised int has an indeterminate value. If you read an indeterminate value (for example, by returning it), then the behaviour of the program is undefined (there are exceptions, but none that apply to your program).

The behaviour that you observed is explained by the behaviour being undefined.
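For contrast, here is a minimal sketch of two ways to get a value that is guaranteed to be zero, so that reading it is well defined. The function names g_value_init and g_static are hypothetical and only serve the illustration; they are not part of the code in the question.

```cpp
// Hypothetical variant of g(): the empty braces value-initialise the int,
// which zero-initialises it, so reading it is well defined.
int g_value_init() {
    int i{};
    return i;
}

// Hypothetical variant of g(): an object with static storage duration is
// zero-initialised before any other initialisation takes place.
int g_static() {
    static int i;
    return i;
}
```

With either variant, an assertion such as assert(g_value_init() == 0) holds at every optimization level, because the result no longer depends on an indeterminate value.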

"it seems strange to me that the other optimization levels seem to go out of their way to ..."

If this seems strange to you, then I suspect that you don't understand what undefined behaviour means.

It is also unclear why you think that the compiler "seems to go out of their way".

edited Sep 01 '21 at 18:40

answered Sep 01 '21 at 18:29

eerorika


1

The "seems to go out of their way" part is discussed in detail here: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html (also applicable to C++) – Eljay Sep 01 '21 at 18:42

score 2 · Answer 3 · answered Sep 01 '21 at 18:42

@eerorika is correct that this is undefined behavior and anything can happen. That said, I was able to reproduce your issue, so I can at least offer a possible reason this is happening in your specific example.

At -O0, g compiles to (on x64 GCC 11.2):

g:
        push    rbp
        mov     rbp, rsp
        mov     eax, DWORD PTR [rbp-4]
        pop     rbp
        ret

The function reserves a stack slot for the local variable but never assigns it a value, and then returns whatever that slot contains. You haven't specified what compiler or system you're using but you're likely to be experiencing similar behavior.

At -O3, the function compiles to:

g:
        xor     eax, eax
        ret

Here, the compiler notices that it doesn't need to create a variable on the stack at all, so it zeroes eax and returns. Since this is undefined behavior, the compiler may even leave out xor eax, eax and not zero the return value. GCC does not appear to do that in your case, but x64 Clang 12.0.0 does.

Either way, both compilers optimize out the asserts completely. This is perfectly fine: since there is undefined behavior, the compiler is allowed to do whatever it wants. So it assumes the return value of g will be 0 and optimizes the assert out.
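As a rough illustration of why the check on f() can disappear, the sketch below expresses the same idea in the language itself rather than in the optimizer. It assumes C++14 or later (the question compiles with -std=c++11), and the name f_constexpr is hypothetical. It is an analogy for constant folding, not a description of what GCC or Clang do internally.

```cpp
// Assumption: C++14 or later, which allows local variables in constexpr functions.
// f_constexpr is a hypothetical constexpr counterpart of f() from the question.
constexpr int f_constexpr() {
    int i = 13;
    return i;
}

// Checked entirely at compile time: the result is a known constant,
// so nothing is left to test at run time.
static_assert(f_constexpr() == 13, "f_constexpr() is a compile-time constant");

// A constexpr counterpart of g() has no such constant to offer: reading an
// uninitialised int is not permitted during constant evaluation, so a
// static_assert on it would be rejected instead of evaluated.
```

The value of f() is fully determined by the source, while g() has no value the compiler is obliged to preserve, which is why any outcome for the second assert is permitted.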

You should however check the assembly generated by your own compiler to confirm this.
