why overflow happens on calculation depending on the data type when the type where the value is being assigned can hold it

Question

Earlier I came up with something, which I solved, but it got me later let's take a look at a similar example of what I was on:

int b = 35000000; //35million
int a = 30000000;
unsigned long n = ( 100 * a ) / b;

Output: 4294967260

I simply changed a to unsigned long and the correct 85% output would come up, because a is a signed 32bit integer. But this got me later. There is no value assignment to a during ( 100 * a ) there is just simply a calculation and the correct value which is 3billion should come up instead of an overflow. To understand if there wasn't really an assignment to a I removed a from the code and manually write the value instead instead:

int b = 35000000;
unsigned long n = ( 100 * 30000000 ) / b;

The big surprise was that the output is also: 4294967260
And of course value of 3billion can be assigned to an unsigned long. My first thought was that ( 100 * 30000000 ) was causing an overflow, but then I asked "an overflow on what? there is nothing to be overflowed". Then I changed b to unsigned long, which even most suprisingly the output was correct 85%.

In the first example changing a to unsigned long

int b = 35000000;
unsigned long a = 30000000;
unsigned long n = ( 100 * a ) / b;

and leaving bas an int as it is works, but on the second example it doesn't, what is occuring?

This might be a little overwhelming to let me re-write all examples with the ones who work and the ones who dont.

Works (Output = 85):

int b = 35000000;
unsigned long a = 30000000;
unsigned long n = ( 100 * a ) / b;

Works (Output = 85):

unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;

Doesn't works (Overflow):

int b = 35000000;
int a = 30000000;
unsigned long n = ( 100 * a ) / b;

Doesn't works (Overflow):

int b = 35000000;
unsigned long n = ( 100 * 30000000 ) / b;

100 is an int. 30000000 is an int. Therefore, 100 * 30000000 is an int. Some other languages might put the overflowed result into a larger type, but not C++. — chris, Sep 19 '12 at 02:55
@chris, then why on `unsigned long b = 35000000`, `unsigned long n = ( 100 * 30000000 ) / b;` the value of `n` is correct? — Vinícius, Sep 19 '12 at 02:57
"an overflow of what?" - the CPU register in which the arithmetic's performed.... — Tony Delroy, Sep 19 '12 at 02:58
@ViniyoShouta, There's probably a better explanation, but signed overflow *is* undefined behaviour. — chris, Sep 19 '12 at 02:59
I understood that it assumes the temporary result should be stored in a register that holds 32bit signed integer because of the values on the calculation, which in my opinion is something that should have been modified after so many years...but why `/b` when `b` is `unsigned long` outputs correctly is still an interesting question — Vinícius, Sep 19 '12 at 03:08

score 3 · Accepted Answer · answered Sep 19 '12 at 03:26

Let me explain what is occuring here.
On:

int b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;

The value is incorrect because overflow happens at ( 100 * 30000000 ) But on:

unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;

The value is correct, so what is happening?

In the first example b is a int, as said by Tony, an overflow happens because the register where the temporary value of ( 100 * 30000000 )will be assigned is able to hold 32bit signed integers, that happens because 100 is an int and 30000000 is also an int AND because b is also an int, the register in this case are smart, when ALL values on the right side are int it assumes the values also have to be an int but when a mighty unsigned long comes to the party, it knows that dividing an int by an unsigned long, / b is wrong, so it stores the value of ( 100 * 30000000 ) to an unsigned long.

Your question made me curious, so I [asked another question](http://stackoverflow.com/questions/12488522/is-it-undefined-behavior-if-the-intermediate-result-of-an-expression-overflows). — Jesse Good, Sep 19 '12 at 04:20

score 2 · Answer 2 · edited Jun 20 '20 at 09:12

2

In C++, there are programming elements called "literal constants".

For example (taken from here):

157 // integer constant

0xFE // integer constant

'c' // character constant

0.2 // floating constant

0.2E-01 // floating constant

"dog" // string literal

So, back to your example, 100 * 30000000 is multiplying two ints together. That is why there is overflow. Anytime you perform arithmetic operations on operands of the same type, you get a result of the same type. Also, in the snippet unsigned long a = 30000000;, you are taking an integer constant 30000000 and assigning that to the variable a of type unsigned long.

To get your desired output, add the ul suffix to the end: unsigned long n = ( 100ul * 30000000ul ) / b;.

Here is a site that has explanations for the suffixes.

why /b when b is unsigned long is still an interesting question

Because 100 * 30000000 is performed before you divide by b and the operands are both of type int.

edited Jun 20 '20 at 09:12

Community

1
1

answered Sep 19 '12 at 03:06

Jesse Good

50,901
14
124
166

Forgive me if I'm wrong, but isn't the **suffix** for unsigned a capital `U`? – chris Sep 19 '12 at 03:10
`Because 100 * 30000000 is performed before you divide by b and the operands are both of type int.` but didn't an overflow already happened `(100 * 30000000) / b` before the value was divided by `b`? – Vinícius Sep 19 '12 at 03:11
@ViniyoShouta: Yes, the overflow already happens before you divide by `b`. – Jesse Good Sep 19 '12 at 03:13
so why when `b` is `unsigned long`, `unsigned long n = ( 100 * 30000000 ) / b;` is correct? (this is what I was saying that is an interesting question) – Vinícius Sep 19 '12 at 03:14
@chris: Where you asking if it needs to be a capital? Both are fine. – Jesse Good Sep 19 '12 at 03:14
@ViniyoShouta: `unsigned long n = ( 100 * 30000000 ) / b;` is not correct. [See here](http://ideone.com/mMtMY). – Jesse Good Sep 19 '12 at 03:15
@JesseGood, Oh, good point. For some reason I thought only one of the two were ever available. Your initial one is also fine combining u with l. – chris Sep 19 '12 at 03:16
You posted it with an `int` Jesse :-P, but don't worry I already got why, thanks Jesse! – Vinícius Sep 19 '12 at 03:17
@ViniyoShouta: Sorry, my mistake. Try compiling with optimizations turned off (I think that `100 * 30000000` gets optimized out). – Jesse Good Sep 19 '12 at 03:21
I answered my own question, thank you for the `ul` tip, I didn't know that! – Vinícius Sep 19 '12 at 03:29

score 1 · Answer 3 · answered Sep 19 '12 at 03:18

The maximum number that can be represented in a 32-bit signed integer without overflow is 2147483647. 100*30000000 is larger than that.

The type of an arithmetic operation is completely independent of the type of the variable you're storing it into. It's based on the type of the operands. If both operands are of type int, the result will be of type int too, and that result will then be converted before it is stored in the variable.

score 1 · Answer 4 · edited May 23 '17 at 10:29

1

Works (Output = 85):

unsigned long b= 35000000;
unsigned long n = ( 100 * 30000000 ) / b;

Not here, using:

#include <iostream>

int main() {
    unsigned long b= 35000000;
    unsigned long n = ( 100 * 30000000 ) / b;
    std::cout << n << std::endl;
    return 0;
}

the output is 527049830640 (and the compiler warned about the overflow even with the default warning level).

The point is that, as Mark Ransom already wrote, the type of an arithmetic operation is determined by the type of its operands.

The type of the constant 100 is int, as is the type of the constant 30000000 (assuming 32-bit or larger ints, would be long int if int is 16 bits). So the multiplication is performed at type int, and with 32-bit ints it overflows. The overflow is undefined behaviour, but wrap-around is the most common manifestation of that undefined behaviour, resulting in the value -1294967296. Then the result of the multiplication is converted to the type of b (since that is an unsigned type and - in C terminology - its integer conversion rank is not smaller than that of int) for the division.

Conversion to an unsigned integer type means reduction modulo 2^WIDTH. If the width of unsigned long is 32, the result of that last conversion is 2^32 - 1294967296 = 3000000000, resulting in the quotient 85. But if - as on my system - the width of unsigned long is 64 bits, the result of that conversion is 2^64 - 1294967296 = 18446744072414584320.

edited May 23 '17 at 10:29

Community

1
1

answered Sep 19 '12 at 04:20

Daniel Fischer

181,706
17
308
431

I'm using VS2003, microsoft compiler, what are you using? – Vinícius Sep 19 '12 at 05:02
In mine unsigned long is also 64bit, I suppose it sets `( 100 * 30000000 )` to an integer instead of a more ranged datatype, which doesn't happens to me because it knows to be divisible by `b` it must have great or equal range – Vinícius Sep 19 '12 at 05:07
Hm, I thought the Microsoft compiler has 32-bit `(unsigned) long` even on 64-bit Windows. But `( 100 * 30000000 )` is an expression of type `int`, there's no way around that. However, it is a constant expression, and as I've just read, having a constant expression overflow makes the program ill-formed, so a diagnostic is required. – Daniel Fischer Sep 19 '12 at 05:19
Yes it is 64bit, `movl $35000000, -4(%ebp) movl $-1294967296, %edx` – Vinícius Sep 19 '12 at 05:26
That looks like a 32-bit result. What does `std::cout << sizeof(unsigned long) << std::endl;` say? – Daniel Fischer Sep 19 '12 at 05:28

Mike DeSimone · Answer 5 · 2012-09-19T03:23:08.090

Another common solution is to typecast one of the constants to the larger, result type prior to operating on it. I prefer this method, since not everyone remembers all the possible suffixes. Including myself.

In this case, I'd use:

unsigned long n = ( (unsigned long)100 * 30000000 ) / b;

The sad part is that this is one thing assembly language—yes, assembly language—gets right that C, C++, and many other languages do not: The result of multiplying an M-bit integer by an N-bit integer is a (M+N)-bit integer, not a (max(M, N))-bit integer.

EDIT: Mark makes an interesting point: the compiler does not "look ahead" to where the result is stored in order to infer a result type. Thus, C++ demands that the result of any sub-expression, by itself, be deterministic. In other words, the exact type of 100 * 30000000 can always be determined without looking at any other piece of code.

why overflow happens on calculation depending on the data type when the type where the value is being assigned can hold it

5 Answers5

Linked