This answer explains why the optimizer behaves this way. In short, the transformation is allowed by the rules of the C standard, and it is desirable because it gives better performance for correct programs (programs that do not rely on behavior the standard leaves undefined).

When optimizing, GCC applies transformations to the program that are valid logical deductions under those rules, and those rules place no requirements on behavior that is undefined.
In its default mode, GCC generates code that largely follows the source code literally. For `if (a+100 < a)`, GCC generates code like:

- Load `a` into register r0.
- Add 100 to register r0.
- Compare register r0 to `a`.

Thus, GCC actually performs the operations in the expression. Because `a` has the value `INT_MAX-1`, and the hardware wraps when 100 is added, the result is less than `a`, so the comparison evaluates to true and the “then” statement of the `if` is executed. (Testing on Godbolt shows this occurs with GCC 7.3 and earlier. With GCC 8.1, the behavior at default settings appears to have changed.)
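For concreteness, here is a minimal program of the kind under discussion (the `printf` in the branch is just an assumed stand-in for whatever the “then” statement does). Compiled without optimization by the GCC versions mentioned above, on typical two’s-complement hardware, the branch is taken because the addition simply wraps:

```c
#include <limits.h>
#include <stdio.h>

int main(void)
{
    int a = INT_MAX - 1;

    /* Signed overflow here is undefined behavior, but unoptimized code
       typically does the add literally: the hardware wraps, the wrapped
       result is less than a, and the branch is taken. */
    if (a + 100 < a)
        printf("branch taken\n");

    return 0;
}
```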
When optimization is requested, GCC creates a semantic model of the program, analyzes it, and applies transformations that produce code that is equivalent within the rules of the C standard (or other language it is compiling).
One mathematical truth for real numbers is that, if `h` is not negative, then `a+h < a` is always false. While this statement is true for real numbers, it is not true for `unsigned` arithmetic, because `unsigned` arithmetic wraps. However, it is true for `int` arithmetic as long as overflow does not occur.
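A small sketch of that distinction (the function names are invented for illustration): the `unsigned` version below can legitimately return 1, so the compiler must keep the comparison, while the `int` version may be assumed never to overflow, so a compiler is entitled to fold it to 0.

```c
/* Unsigned arithmetic wraps by definition, so u + 100 < u can really
   be true (for example when u == UINT_MAX); the test must be kept. */
int wraps_unsigned(unsigned u)
{
    return u + 100 < u;
}

/* Signed overflow is undefined behavior, so the compiler may assume it
   never happens and reduce this function to "return 0;". */
int assumed_no_overflow(int a)
{
    return a + 100 < a;
}
```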
Now, what about when overflow does occur? In that case, the behavior is not defined by the C standard. So, for any given execution, there are two possibilities:
- If overflow does not occur, the rule is mathematically valid, and using it results in a transformed program that is equivalent within the rules of the standard.
- If overflow does occur, the behavior of the program is not defined by the C standard, so the transformed program is also allowed by the rules of the C standard.
This means we can always apply the rule and ignore whether overflow occurs or not, and we will be conforming to the C standard.
But why do we want to do this? We have just allowed any arbitrary transformation of our program if overflow occurs. Well, there is a good result from this. Sometimes, in the middle of a program, we find code that, taken by itself, could overflow. If we refrained from performing this optimization, the resulting program would be slower, since it was not optimized. But we can add an assumption: the programmer designed this program correctly. Whatever particular situation we are in at this point in the program, the programmer should have designed the control flow so that overflow does not happen here. So even though overflow could happen at this spot if this routine were called with, say, `x` equal to some particular value, the programmer should have written the program so the routine is never called like that.
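For instance (a generic illustration, not code from the question), assuming the signed loop counter below never overflows lets the compiler conclude the loop always terminates and know its trip count, which helps it unroll or vectorize the loop; with wrap-around semantics it would have to account for `i` wrapping past `INT_MAX` when `n` is very large.

```c
/* Because signed overflow is undefined, the compiler may assume
   i += 2 never wraps, so i climbs monotonically toward n and the
   loop runs about n/2 times.  If i were allowed to wrap, then for
   n == INT_MAX the increment could overflow, i would become negative,
   and the loop might never terminate. */
void bump_even_indices(int *p, int n)
{
    for (int i = 0; i < n; i += 2)
        p[i] += 1;
}
```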
Therefore a choice was made to assume that, for the purposes of applying optimizations, overflow does not happen. In consequence, GCC uses the rule (or something similar) that, for signed arithmetic, if `h` is not negative, then `a+h < a` is always false.
So, when GCC sees the code `a+100 < a` and is optimizing, it replaces this code with `0`, meaning false. Then it further optimizes the `if` and removes the “then” statement completely.
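The net effect is roughly as if the program had been written without the test at all. A hand-written sketch of that result (to show its shape, not actual GCC output):

```c
#include <limits.h>

int main(void)
{
    int a = INT_MAX - 1;
    (void) a;   /* still initialized, but never examined */

    /* The optimizer folded a + 100 < a to 0 and removed the if along
       with its "then" statement, so nothing in that branch remains. */

    return 0;
}
```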
Of course, you might ask: well, `int a=INT_MAX-1` is just above `a+100<a`; cannot the compiler see that and know `a+100<a` is “true” in this case? Theoretically, maybe. But computers are not intuitive and do not always look at the whole situation. The compiler may know `a` is a constant with a particular value, and it may at times evaluate expressions like `a+100<a` at compile time. But it is built with thousands of rules, and it applies them mechanically in some order resulting from its programming. It is not easy to design software to step back and look at the big picture. Once it finds its optimization to change `a+100<a` to `0`, it applies it, and the change is done.