0

I find the differences between the assembly generated for the following code without optimization and with -Os optimization very strange.

#include <stdio.h>

int main(){
    int i;

    for(i=3;i>2;i++);

    printf("%d\n",i);

    return 0;
}

Without optimization, the generated code is:

000000000040052d <main>:
  40052d:   55                      push   %rbp
  40052e:   48 89 e5                mov    %rsp,%rbp
  400531:   48 83 ec 10             sub    $0x10,%rsp
  400535:   c7 45 fc 03 00 00 00    movl   $0x3,-0x4(%rbp)
  40053c:   c7 45 fc 03 00 00 00    movl   $0x3,-0x4(%rbp)
  400543:   eb 04                   jmp    400549 <main+0x1c>
  400545:   83 45 fc 01             addl   $0x1,-0x4(%rbp)
  400549:   83 7d fc 02             cmpl   $0x2,-0x4(%rbp)
  40054d:   7f f6                   jg     400545 <main+0x18>
  40054f:   8b 45 fc                mov    -0x4(%rbp),%eax
  400552:   89 c6                   mov    %eax,%esi
  400554:   bf f4 05 40 00          mov    $0x4005f4,%edi
  400559:   b8 00 00 00 00          mov    $0x0,%eax
  40055e:   e8 ad fe ff ff          callq  400410 <printf@plt>
  400563:   b8 00 00 00 00          mov    $0x0,%eax
  400568:   c9                      leaveq 
  400569:   c3                      retq  

and the output is: -2147483648 (as I expect on a PC)

With -Os, the generated code is:

0000000000400400 <main>:
  400400:   eb fe                   jmp    400400 <main>

I think the second result is an error! I think the compiler should have generated something equivalent to:

printf("%d\n",-2147483648);
Sir Jo Black
  • 7
    No it is not. There is no use in examining _undefined behaviour_. Your `for` loop will eventually overflow a signed integer -> UB! – too honest for this site Sep 28 '15 at 14:59
  • `i` becomes less than 2 once it passes the maximum positive value: with two's complement notation it then becomes the negative number with the largest absolute value that can be represented! – Sir Jo Black Sep 28 '15 at 15:06
  • 3
    That behaviour is defined only for *unsigned* – Weather Vane Sep 28 '15 at 15:08
  • My thinking is that with or without optimizations the compiled result behaviour (in this case) should be the same! – Sir Jo Black Sep 28 '15 at 15:09
  • No it does not. **Signed** integer overflow is UB! Please read http://port70.net/~nsz/c/c11/n1570.html and think about the implications of the word "**undefined**" There is no use in childish "but I want it to". – too honest for this site Sep 28 '15 at 15:10
  • 1
    @SergioFormiggini: "But I wanna!". *My thinking* is that you should stick with Python or something if you can't accept what C is. – EOF Sep 28 '15 at 15:10
  • The compiler can assume that signed overflow never happens, so the value can never get below 2 - and then you have `for( ; true; );`. – Bo Persson Sep 28 '15 at 15:10
  • The problem is that the compiler acts in two different ways (and is always the same compiler), furthermore the compiler doesn't signal any warning! – Sir Jo Black Sep 28 '15 at 15:11
  • Please use a different language then! – too honest for this site Sep 28 '15 at 15:12
  • I use C since it was ... – Sir Jo Black Sep 28 '15 at 15:13
  • 5
    Undefined behavior means that **anything** can happen. Absolutely anything, including producing different results each time. Or no result at all. – Bo Persson Sep 28 '15 at 15:13
  • But I think the compiler knows the machine it targets ... Old compilers took the CPU's number representation into account! – Sir Jo Black Sep 28 '15 at 15:15
  • The C int acts as the word of the CPU! But in modern times we prefer to forget this! (Remember the jb, ja, jg, jl instructions, and don't reply that the CPU's registers are unsigned.) – Sir Jo Black Sep 28 '15 at 15:19
  • 3
    gcc has the options `-ftrapv` which makes signed overflow crash and `-fwrapv` which makes it wrap round. – Timothy Baldwin Sep 28 '15 at 15:19
  • I'd like to ask the opinion of Kernighan and Ritchie! – Sir Jo Black Sep 28 '15 at 15:21
  • @Weather Vane, thanks for the info about -fwrapv and -ftrapv! – Sir Jo Black Sep 28 '15 at 15:29
  • Not me, you should thank @TimothyBaldwin. – Weather Vane Sep 28 '15 at 15:30
  • @ Timothy Baldwin, thanks for the info about -fwrapv and -ftrapv! – Sir Jo Black Sep 28 '15 at 15:31
  • @Timothy Baldwin, the code below runs without exception both using -fwrapv and -ftrapv! Is this correct!? – Sir Jo Black Sep 28 '15 at 15:37
  • @SergioFormiggini _"My thinking is that with or without optimizations the compiled result behaviour (in this case) should be the same!"_ No, that's not how **undefined behaviour** works. The standard leaves many things undefined to help optimisation to produce better code, so in this case you don't get the same code with/without optimisation. With optimisation enabled the compiler can do extra control flow analysis and so is able to understand the code (and any potential undefined behaviour) better. Optimisation should not change the result of **correct** programs, but yours is not correct. – Jonathan Wakely Sep 28 '15 at 15:37
  • GCC's `-ftrapv` doesn't work. It works better in Clang. – Jonathan Wakely Sep 28 '15 at 15:39
  • @Jonathan Wakely, OK! What you say about optimizations is what I say about other aspects of programming that affect optimization results ... If this is deemed UB, I have to accept that fact! But I would prefer this case to be defined :) – Sir Jo Black Sep 28 '15 at 15:41
  • 1
    C, the spec, doesn't guarantee that ints are implemented using two's complement. Yes, in i386, there's two's complement arithmetic, but that might not be true for other architectures, so C leaves `INT_MAX+1` undefined. GCC takes advantage of that in optimizations, regardless of whether the target architecture is two's comp, unless you tell it not to via `-fwrapv`. – Colonel Thirty Two Sep 28 '15 at 15:47
  • [Why does integer overflow on x86 with GCC cause an infinite loop?](http://stackoverflow.com/questions/7682477/why-does-integer-overflow-on-x86-with-gcc-cause-an-infinite-loop). Signed overflow invokes UB, as others said, and the compiler is free to decide what to do. In this case an infinite loop – phuclv Sep 28 '15 at 15:48
  • Honestly, in this case, I would have preferred that the behavior was defined as CPU-dependent! But if it's defined UB I've to accept this! ... I've to move my knowledge beyond ... :) – Sir Jo Black Sep 28 '15 at 15:58

2 Answers

7

The compiler is working as it should.

Signed integer overflow is illegal in C, and results in undefined behaviour. Any program that relies on it is broken.


The compiler replaces `for(i=3;i>2;i++);` with `while(1);`, because it sees that `i` starts from 3 and only increases, so its value can never be less than 3.

Only overflow could make the loop exit. But that is illegal, and the compiler assumes that you would never do such a dirty thing.

Because the loop is infinite, `printf` is never reached and can be removed.


The unoptimized version worked only by accident. The compiler could have done the same thing there, and it would have been equally valid.
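If the intent really is to count upwards until the value can grow no further, the loop has to stop before the increment that would overflow. The following is only a minimal sketch of that idea, not something from the question: `count_up_checked` and its `hit_int_max` flag are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from the question): behaves like
 *     for (i = start; i > limit; i++);
 * but stops before the increment that would overflow an int. */
int count_up_checked(int start, int limit, bool *hit_int_max)
{
    assert(hit_int_max != NULL);
    *hit_int_max = false;

    int i = start;
    while (i > limit) {
        if (i == INT_MAX) {      /* i + 1 would not fit in an int */
            *hit_int_max = true;
            break;
        }
        i++;
    }
    return i;
}
```

Called with `start` 3 and `limit` 2, this walks all the way up to `INT_MAX` (roughly two billion iterations), sets the flag and returns `INT_MAX`. Because no signed overflow ever happens, the result does not depend on the optimization level.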

user694733
  • The unoptimized version works because the C int behaviour is the same as the CPU's ... – Sir Jo Black Sep 28 '15 at 15:17
  • @SergioFormiggini Yes, and that is an accident. – Douglas Leeder Sep 28 '15 at 15:18
  • @Sergio - It is still undefined, but one of the *many* possible results is to display the value you expected. That's allowed too. – Bo Persson Sep 28 '15 at 15:19
  • I'd like to ask the opinion of Kernighan and Ritchie! – Sir Jo Black Sep 28 '15 at 15:21
  • @SergioFormiggini Note that the unoptimized version would also have behaved unexpectedly on systems that do not use 2's complement. – user694733 Sep 28 '15 at 15:21
  • @user694733. For example! – Sir Jo Black Sep 28 '15 at 15:22
  • 1
    @SergioFormiggini: C99 and K&R C are now quite different implementations. Much of current code would be rejected by a K&R compiler, and much of K&R code would also be rejected by a C99 conformant compiler! I gave reference to (draft of) current standard in my answer explaining why you invoked undefined behaviour and as such got undefined results. I agree with you, and really dislike this compiler behaviour but it **is** standard conformant. – Serge Ballesta Sep 28 '15 at 15:29
  • That is probably because I work with MCUs, and when I write code for an MCU I know what I'm doing, so I prefer CPU-compatible behaviour. I would rather know that the result is CPU-dependent than assume UB. – Sir Jo Black Sep 28 '15 at 15:54
  • @Serge Ballesta, it's a standard, so we have to accept it! Honestly, I think the way the current standard handles some "UB" is too strong. – Sir Jo Black Sep 28 '15 at 23:26
  • @user694733, it is illegal in modern standards, not in K&R C and not in C89 (I think). – Sir Jo Black Sep 28 '15 at 23:28
4

Well, the compiler is allowed to assume that the program will never exhibit undefined behaviour.

You get INT_MIN in the first case because INT_MAX + 1 overflows and happens to give INT_MIN (*), but this is undefined behaviour. The C99 draft (n1256) says at 6.5 Expressions §5: "If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined."
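One way to stay clear of that "exceptional condition" is to test whether the mathematical result is representable before doing the addition. The sketch below is illustrative only: `checked_add` is a hypothetical helper name, and it relies on nothing beyond `INT_MAX` and `INT_MIN` from `<limits.h>`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores a + b in *sum and returns true only when the
 * mathematical result fits in an int; otherwise it leaves *sum untouched
 * and returns false. No signed overflow is ever evaluated. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if (b > 0 && a > INT_MAX - b)
        return false;            /* a + b would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;            /* a + b would fall below INT_MIN */
    *sum = a + b;
    return true;
}
```

With this, `checked_add(INT_MAX, 1, &s)` reports failure instead of invoking undefined behaviour. GCC and Clang also offer `__builtin_add_overflow` as a compiler extension for the same purpose, at the cost of portability.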

So the compiler can reason:

  • loop starts with an index value greater than the limit
  • index is always increased
  • if no UB occurs, index will always be greater than the limit => this is an infinite loop

Under the as-if rule (5.1.2.3 Program execution §3: "An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced"), it can replace your loop with an infinite loop. The instructions that follow can then never be reached and can be removed.

You invoked undefined behaviour and got... undefined behaviour.

(*) And even this is plainly implementation-dependent: INT_MIN could be -2147483647 if you had 1's complement, 0x80000000 could be a negative 0 (with sign-and-magnitude), or overflow could raise a signal...
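When wrap-around is actually wanted, unsigned arithmetic provides it with defined behaviour, because unsigned results are reduced modulo one more than the largest representable value. A minimal sketch, with hypothetical helper names and assuming `unsigned` has at least one more value bit than `int` (true on mainstream platforms):

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: increment with wrap-around. Unsigned arithmetic is
 * reduced modulo UINT_MAX + 1, so UINT_MAX + 1u is 0u by definition. */
unsigned wrapping_increment(unsigned value)
{
    return value + 1u;
}

/* Hypothetical helper: the value one past INT_MAX, computed without any
 * signed overflow. */
unsigned one_past_int_max(void)
{
    unsigned result = wrapping_increment((unsigned)INT_MAX);
    /* Assumption: unsigned has more value bits than int, so no wrap to 0. */
    assert(result != 0u);
    return result;
}
```

Converting such a value back to `int` is implementation-defined rather than undefined, so the compiler cannot use it to assume the loop never ends. For signed arithmetic, the `-fwrapv` option mentioned in the comments asks GCC to treat overflow as wrapping instead.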

Serge Ballesta