Is it undefined behavior if the intermediate result of an expression overflows?

Question

This question is a result of another SO question.

Example Code

#include <iostream>

int main()
{
    unsigned long b = 35000000;
    int i = 100;
    int j = 30000000;
    unsigned long n = ( i * j ) / b; // #1
    unsigned long m = ( 100 * 30000000 ) / b; // #2
    std::cout << n << std::endl;
    std::cout << m << std::endl;
}

Output

85
85

Compiling this code with g++ -std=c++11 -Wall -pedantic -O0 -Wextra gives the following warning:

9:28: warning: integer overflow in expression [-Woverflow]

Questions

1) Am I correct in thinking that #1 and #2 invoke undefined behavior because the intermediate result 100 * 30000000 does not fit into an int? Or is the output I am seeing well-defined?
2) Why do I only get a warning with #2?

Correct me if I'm wrong, but I think GCC optimizes your code slightly before actually compiling it, so it may simplify arithmetic expressions. If the size of the resulting number is greater than the size of the data type, it will definitely throw an error. `#1` has `i` and `j`, but you can change `i` and `j` before computing `n`. — Blender, Sep 19 '12 at 04:22
Likely duplicate for http://stackoverflow.com/questions/10882368/overflow-issues-when-implementing-math-formulas — WhozCraig, Sep 19 '12 at 04:22
@CraigNelson no it isn't. The questions are asking different things. — Rapptz, Sep 19 '12 at 04:36
@Blender: Yes, I think you are correct. Adding `const` will allow the compiler to produce a warning with `#1` also because it knows `i` and `j` do not change. — Jesse Good, Sep 19 '12 at 05:01

score 3 · Accepted Answer · answered Sep 19 '12 at 04:22

Yes, it is undefined behaviour, and the result you get is usually¹ different if unsigned long is a 64-bit type.

¹ It's UB, so there are no guarantees.

answered Sep 19 '12 at 04:22 by Daniel Fischer

Can you provide a standard reference that this is UB vs say implementation defined? – Mark B Sep 19 '12 at 05:05
In C++03, section 5 Expressions, paragraph 5 "If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined, unless such an expression is a constant expression (5.19), in which case the program is ill-formed." I'd be surprised if anything except the location changed in C++11 there. – Daniel Fischer Sep 19 '12 at 05:12
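
One way to sidestep the overflow discussed above is to widen an operand before multiplying, so the intermediate product is computed in a type that can hold it. The following is only a sketch: `scaled_ratio` is a hypothetical helper that does not appear in the thread, and it assumes non-negative inputs and an `int` no wider than 32 bits, as in the question.

```cpp
#include <cassert>

// Hypothetical helper (not from the thread): computes (i * j) / b without
// overflowing int, by widening one operand to long long before multiplying.
unsigned long long scaled_ratio(int i, int j, unsigned long b)
{
    // Assumption: i and j are non-negative and b is non-zero, as in the question.
    assert(i >= 0 && j >= 0 && b != 0);

    // The cast applies to i first, so the multiplication is done in long long.
    const long long product = static_cast<long long>(i) * j;

    return static_cast<unsigned long long>(product) / b;
}
```

Called as `scaled_ratio(100, 30000000, 35000000UL)`, this never evaluates an `int` product that is out of range, so the quoted paragraph of the standard does not come into play.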

score 2 · Answer 2 · edited Sep 19 '12 at 05:02

1) Yes, it's undefined behavior.

2) Because #1 involves variables (not constants), so the compiler in general doesn't know whether it will overflow (although in this case it does, and I don't know why it doesn't warn).

edited Sep 19 '12 at 05:02 by Jesse Good
answered Sep 19 '12 at 04:24 by user1610015

No, that was just supplementary. I changed `know know` to `don't know`. – Jesse Good Sep 19 '12 at 07:46

score 2 · Answer 3 · answered Sep 19 '12 at 04:26

Intermediate result

Yes, this is undefined behaviour. What if you just stopped right there and returned m? The compiler needs to get from point A to point B, and you've told it to do so by performing that calculation (which isn't possible in an int). A compiler may choose to optimize this statement in such a way that you don't get an overflow, but as far as I know, the standard doesn't require the optimizer to do anything.

Why no error when they're variables?

You're explicitly telling gcc not to optimize at all (-O0), so my assumption is that it doesn't know the values of i and j at that point. Normally you'd learn the values because of constant folding, but like I said, you told it not to optimize.

If you re-run this and it still doesn't mention it, there's also the possibility that this warning is generated before the optimizer runs, so it's just not smart enough to do constant folding at all for this step.

Adding `const` produced a warning for `#1`, so it definitely has to do with constant folding, changing the optimization level didn't change anything for non-const variables though. — Jesse Good, Sep 19 '12 at 04:50
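
Since the compiler cannot in general warn about an overflow when the operands are plain variables, the check can be made at run time instead. The sketch below assumes GCC or Clang: `__builtin_mul_overflow` is a compiler extension, not standard C++, and `checked_mul` is a hypothetical wrapper name that does not come from the thread.

```cpp
// Hypothetical wrapper around a GCC/Clang extension (not standard C++).
// Returns true and stores i * j in out when the product fits in an int.
// Returns false when the product would overflow; no undefined behavior is
// executed in either case.
bool checked_mul(int i, int j, int& out)
{
    // The builtin returns true when the mathematical result does NOT fit.
    return !__builtin_mul_overflow(i, j, &out);
}
```

With the values from the question, `checked_mul(100, 30000000, r)` would report failure on a platform with a 32-bit `int`, which is the condition the `-Woverflow` warning flags for `#2` at compile time.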

score 1 · Answer 4 · edited May 23 '17 at 12:05

You get a warning with #2 because the compiler knows the values of both operands. The outputs are right because both expressions divide by b, which is an unsigned long. The temporary value that is divided by b must be held in a type with a range at least as large as b's. ( i * j ) or ( 100 * 30000000 ) is stored in a CPU register that has the same range as the type of the divisor: if b were an int, the temporary result would be an int; since b is an unsigned long, and an int can't be divided by an unsigned long directly, the temporary value is stored as an unsigned long.

It is undefined behavior if it overflows, but it's not overflowing in those cases

A program with the same structure, changed only so that b is an int, differs in just two lines of the generated .s code.

cltd
idivl   (%ecx)

for b = int, versus

movl    $0, %edx
divl    (%ecx)

for b = unsigned long.

idivl performs signed division, storing the value as signed
divl performs unsigned division, storing the value as unsigned

So you're right, the operation does overflow; the output is correct because of the division operation.

What is the difference between idivl and divl?

https://stackoverflow.com/a/12488534/1513286

It does overflow, otherwise gcc wouldn't give the warning `integer overflow in expression`. Like the other answers say, the behavior is undefined. — Jesse Good, Sep 19 '12 at 05:22
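
To illustrate why the output happened to be 85, and why the accepted answer expects a different result with a 64-bit unsigned long, here is a sketch that never executes the overflowing multiplication. It assumes, purely for illustration, that the multiply wrapped to the 32-bit value -1294967296 (which is 3000000000 minus 2^32); nothing guarantees this, since the behavior is undefined. The names `wrapped_product`, `quotient_32` and `quotient_64` are hypothetical.

```cpp
#include <cstdint>

// Assumed leftover of a wrapping 32-bit multiply for 100 * 30000000.
// This is an illustration only: the real program has undefined behavior.
constexpr std::int32_t wrapped_product = -1294967296;

// Converting a negative value to an unsigned type is well defined: the
// result is the value modulo 2^N. With N = 32 this gives 3000000000.
std::uint32_t quotient_32(std::uint32_t b)
{
    return static_cast<std::uint32_t>(wrapped_product) / b;
}

// With N = 64 the converted value is 2^64 - 1294967296, so the quotient
// is a very different, much larger number.
std::uint64_t quotient_64(std::uint64_t b)
{
    return static_cast<std::uint64_t>(wrapped_product) / b;
}
```

Under that assumption, `quotient_32(35000000)` yields 85, while `quotient_64(35000000)` does not. The unsigned division only makes the wrapped bits look like the intended product when unsigned long is 32 bits wide.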

score 0 · Answer 5 · answered Sep 19 '12 at 05:38

As per 5/4, the result is undefined behavior.

However, note that if you changed the types to unsigned (for the constants, just add the u suffix), not only do the values fit, but according to 3.9.1/4 the arithmetic becomes modulo arithmetic, and the result is perfectly defined even for larger intermediate values that do not fit the type.
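
A minimal sketch of the unsigned variant described above. The function names are hypothetical, and the first function assumes an `unsigned int` of at least 32 bits so that 3000000000 fits without wrapping.

```cpp
#include <limits>

// i * j is evaluated in unsigned int, i.e. modulo 2^N, so it can never be
// undefined. The product is then converted to unsigned long for the division.
unsigned long unsigned_ratio(unsigned int i, unsigned int j, unsigned long b)
{
    return (i * j) / b;
}

// Even a result that does not fit is well defined: the largest unsigned int
// plus one wraps around to zero.
unsigned int wraps_to_zero()
{
    return std::numeric_limits<unsigned int>::max() + 1u;
}
```

For the question's values, `unsigned_ratio(100u, 30000000u, 35000000UL)` gives 85 by definition rather than by accident.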
