This question is a result of another SO question.

Example Code

#include <iostream>

int main()
{
    unsigned long b = 35000000;
    int i = 100;
    int j = 30000000;
    unsigned long n = ( i * j ) / b; // #1
    unsigned long m = ( 100 * 30000000 ) / b; // #2
    std::cout << n << std::endl;
    std::cout << m << std::endl;
}

Output

85
85

Compiling this code with g++ -std=c++11 -Wall -pedantic -O0 -Wextra gives the following warning:

9:28: warning: integer overflow in expression [-Woverflow]

Questions

  1. Am I correct in thinking that #1 and #2 invoke undefined behavior because the intermediate result 100 * 30000000 does not fit into an int? Or is the output I am seeing well-defined?

  2. Why do I only get a warning with #2?

Jesse Good
  • Correct me if I'm wrong, but I think GCC optimizes your code slightly before actually compiling it, so it may simplify arithmetic expressions. If the size of the resulting number is greater than the size of the data type, it will definitely throw an error. `#1` has `i` and `j`, but you can change `i` and `j` before computing `n`. – Blender Sep 19 '12 at 04:22
  • Yes, any sort of signed-integer overflow is UB. – Mysticial Sep 19 '12 at 04:22
  • Likely duplicate for http://stackoverflow.com/questions/10882368/overflow-issues-when-implementing-math-formulas – WhozCraig Sep 19 '12 at 04:22
  • @CraigNelson no it isn't. The questions are asking different things. – Rapptz Sep 19 '12 at 04:36
  • @Blender: Yes, I think you are correct. Adding `const` will allow the compiler to produce a warning with `#1` also because it knows `i` and `j` do not change. – Jesse Good Sep 19 '12 at 05:01

5 Answers

Yes, it is undefined behaviour, and the result you get is usually¹ different if unsigned long is a 64-bit type.

¹ It's UB, so there are no guarantees.
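One way to avoid the overflowing intermediate altogether is to widen an operand before multiplying. The following is a minimal sketch, not part of the original answer; `scaled_wide` is a hypothetical name, and it assumes non-negative inputs whose product fits in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes (i * j) / b without overflowing int.
// Assumption: i and j are non-negative and b is non-zero.
std::uint64_t scaled_wide(int i, int j, std::uint64_t b) {
    assert(b != 0);
    // Widening one operand first makes the whole multiplication 64-bit.
    const std::int64_t product = static_cast<std::int64_t>(i) * j;
    assert(product >= 0);
    return static_cast<std::uint64_t>(product) / b;
}
```

With the values from the question, `scaled_wide(100, 30000000, 35000000)` yields 85, and the result no longer depends on the width of `unsigned long`.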

Daniel Fischer
  • Can you provide a standard reference that this is UB vs say implementation defined? – Mark B Sep 19 '12 at 05:05
  • In C++03, section 5 Expressions, paragraph 5: "If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined, unless such an expression is a constant expression (5.19), in which case the program is ill-formed." I'd be surprised if anything except the location changed in C++11 there. – Daniel Fischer Sep 19 '12 at 05:12

1) Yes, it's undefined behavior.

2) Because #1 involves variables (not constants), so the compiler in general doesn't know whether it will overflow (although in this case it does, and I don't know why it doesn't warn).
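Since the compiler generally cannot warn about values it only sees at run time, the check can be done by hand before multiplying. This is a rough sketch rather than anything from the original answer: `mul_would_overflow` is a hypothetical helper, and it only handles non-negative operands.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: returns true if a * b would exceed the largest int.
// Assumption: both operands are non-negative (negative values need more cases).
bool mul_would_overflow(int a, int b) {
    assert(a >= 0 && b >= 0);
    if (a == 0 || b == 0) {
        return false;  // a zero product always fits
    }
    // a * b > max  is equivalent to  a > max / b  for positive b,
    // and the division itself can never overflow.
    return a > std::numeric_limits<int>::max() / b;
}
```

On a platform with a 32-bit `int`, `mul_would_overflow(100, 30000000)` reports an overflow, while `mul_would_overflow(100, 20000000)` does not.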

Jesse Good
user1610015

Intermediate result

Yes, this is undefined behaviour. What if you just stopped right there and returned m? The compiler needs to get from point A to point B, and you've told it to do so by performing that calculation (which isn't possible in an int). A compiler may choose to optimize this statement in such a way that you don't get an overflow, but as far as I know, the standard doesn't require the optimizer to do anything.

Why no error when they're variables?

You're explicitly telling gcc not to optimize at all (-O0), so my assumption is that it doesn't know the values of i and j at that point. Normally you'd learn the values because of constant folding, but like I said, you told it not to optimize.

If you re-run this and it still doesn't mention it, there's also the possibility that this warning is generated before the optimizer runs, so it's just not smart enough to do constant folding at all for this step.
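If the goal is to have the compiler evaluate the arithmetic regardless of optimization level, C++11 `constexpr` is one option; this is a different technique from the `-O0` constant folding discussed above, and the names below are made up for illustration. The sketch widens the constants so that the compile-time evaluation is well-defined.

```cpp
#include <cstdint>

// Hypothetical constants: the multiplication is done in 64 bits at compile time.
constexpr std::int64_t kProduct = static_cast<std::int64_t>(100) * 30000000;
constexpr std::int64_t kQuotient = kProduct / 35000000;

static_assert(kProduct == 3000000000LL, "product evaluated without overflow");
static_assert(kQuotient == 85, "same quotient as in the question");

// Assumption based on the standard's rule for constant expressions: a
// conforming compiler is expected to reject the line below, because the
// int multiplication overflows during constant evaluation.
// constexpr int kBad = 100 * 30000000;
```

The commented-out line is left disabled so the snippet compiles; enabling it is a quick way to see how a given compiler reports the overflow.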

Brendan Long
  • Adding `const` produced a warning for `#1`, so it definitely has to do with constant folding, changing the optimization level didn't change anything for non-const variables though. – Jesse Good Sep 19 '12 at 04:50

You get a warning with #2 because the compiler knows the values of the operands. The outputs are right because both expressions divide by b, which is unsigned long. The temporary value to be divided by b must be held in a type of equal or greater range: ( i * j ) or ( 100 * 30000000 ) is stored in a CPU register with the same range as the type of the divisor. If b were an int, the temporary result would be an int; since b is an unsigned long and an int can't be divided by an unsigned long directly, the temporary value is stored as an unsigned long.

It is undefined behavior if it overflows, but it's not overflowing in those cases

A program with the same structure, changing only b to int, differs in only two lines of the generated .s code:

cltd
idivl   (%ecx)

for b = int, and

movl    $0, %edx
divl    (%ecx)

for b = unsigned long.

idivl performs signed division, storing the value as signed.
divl performs unsigned division, storing the value as unsigned.

So you're right, the operation does overflow; the output is correct because of the division operation.

What is the difference of idivl and divl?

https://stackoverflow.com/a/12488534/1513286

Vinícius
  • It does overflow, otherwise gcc wouldn't give the warning `integer overflow in expression`. Like the other answers say, the behavior is undefined. – Jesse Good Sep 19 '12 at 05:22

As per 5/4 (clause 5, paragraph 4 of the standard), the result is undefined behavior.

However, note that if you changed the types to unsigned (for the constants, just add the u suffix), not only would the values fit, but according to 3.9.1/4 the arithmetic becomes modulo arithmetic, and the result is perfectly defined even for larger intermediate values that do not fit the type.
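A small sketch of the unsigned variant follows. It is an illustration only: `scaled_unsigned` is a hypothetical name, and the fixed-width `std::uint32_t` is used on the assumption that `int` is 32 bits wide, so the operands are not promoted to a signed type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiply-then-divide entirely in 32-bit unsigned
// arithmetic. The product wraps modulo 2^32 instead of overflowing.
std::uint32_t scaled_unsigned(std::uint32_t i, std::uint32_t j, std::uint32_t b) {
    assert(b != 0);
    return (i * j) / b;
}
```

`scaled_unsigned(100u, 30000000u, 35000000u)` gives 85 because 3,000,000,000 fits in 32 unsigned bits. `scaled_unsigned(200u, 30000000u, 35000000u)` gives 48: the product 6,000,000,000 wraps to 1,705,032,704, which is well-defined but not the mathematical quotient of 171.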

Analog File