The compiler is very, very rarely wrong. In this case, b
is overflowing, which is undefined behaviour for signed integers:
$ g++ --version
g++ (GCC) 10.2.0
...
$ g++ -Os -otest test.cpp
test.cpp: In function ‘int main()’:
test.cpp:8:11: warning: iteration 6 invokes undefined behavior [-Waggressive-loop-optimizations]
8 | b += a;
| ~~^~~~
test.cpp:6:24: note: within this loop
6 | for (short i = 0; i<7; ++i)
| ~^~
And if you invoke undefined behaviour, the compiler is free to do whatever it likes, including making your program never terminate.
Edit: Some people seem to think that the UB should only affect the value of b
, but not the loop iteration. This is not according to the Standard (UB can cause literally anything to happen) but it's a reasonable thought, so let's look at the generated assembly to see why the loop doesn't terminate.
First without -Os
:
.LC0:
.string "%d "
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-12], 333666999
mov DWORD PTR [rbp-4], 0
mov WORD PTR [rbp-6], 0
.L3:
cmp WORD PTR [rbp-6], 6 # Compare i to 6
jg .L2 # If greater, jump to end
mov eax, DWORD PTR [rbp-12]
add DWORD PTR [rbp-4], eax
mov eax, DWORD PTR [rbp-4]
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
movzx eax, WORD PTR [rbp-6]
add eax, 1
mov WORD PTR [rbp-6], ax
jmp .L3
.L2:
mov eax, 9
leave
ret
Then with -Os
:
.LC0:
.string "%d "
main:
push rbx
xor ebx, ebx
.L2:
add ebx, 333666999
mov edi, OFFSET FLAT:.LC0
xor eax, eax
mov esi, ebx
call printf
jmp .L2
The comparison and jump instructions are completely gone. Ironically, the compiler did exactly what you asked it to do: optimize for size, so remove as many instructions as it can while obeying the C++ standard. -O3
and -O2
generate the exact same code as -Os
here.
-O1
generates a very interesting output:
.LC0:
.string "%d "
main:
push rbx
mov ebx, 0
.L2:
add ebx, 333666999
mov esi, ebx
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
cmp ebx, -1959298303
jne .L2
mov eax, 9
pop rbx
ret
Here, the compiler optimized away the loop counter i
and just compares the value of b
to its final value after 7 iterations, using the fact that signed overflow happens according to two's complement on this platform! Cheeky, isn't it? :)