21

After getting advised to read "C++ Primer 5 ed by Stanley B. Lipman" I don't understand this:

Page 66. "Expressions Involving Unsigned Types"

unsigned u = 10;
int i = -42;
std::cout << i + i << std::endl; // prints -84
std::cout << u + i << std::endl; // if 32-bit ints, prints 4294967264

He said:

In the second expression, the int value -42 is converted to unsigned before the addition is done. Converting a negative number to unsigned behaves exactly as if we had attempted to assign that negative value to an unsigned object. The value “wraps around” as described above.

But if I do something like this:

unsigned u = 42;
int i = -10;
std::cout << u + i << std::endl; // Why the result is 32?

As you can see -10 is not converted to unsigned int. Does this mean a comparison occurs before promoting a signed integer to an unsigned integer?

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
Alex24
  • 600
  • 2
  • 13
  • 16
    `As you can see -10 is not converted to unsigned int.` It is. – tkausl Jan 14 '19 at 22:28
  • 2
    Google about binary numbers and the way they are represented, in particular signedness. Then all shall become clear. – DeiDei Jan 14 '19 at 22:33
  • 2
    What result were you expecting instead of 32? – Barmar Jan 15 '19 at 01:18
  • Possibly related: [c++ safeness of code with implicit conversion between signed and unsigned](https://stackoverflow.com/questions/53181174/c-safeness-of-code-with-implicit-conversion-between-signed-and-unsigned) – francesco Jan 15 '19 at 12:27
  • Thank you all guys. You make it clear for me now. – Alex24 Jan 16 '19 at 17:51

5 Answers5

26

-10 is being converted to a unsigned integer with a very large value, the reason you get a small number is that the addition wraps you back around. With 32 bit unsigned integers -10 is the same as 4294967286. When you add 42 to that you get 4294967328, but the max value is 4294967296, so we have to take 4294967328 modulo 4294967296 and we get 32.

NathanOliver
  • 171,901
  • 28
  • 288
  • 402
  • What I find interesting with modulo arithmetic is that by "convoluted" means, you get the same answer as regular arithmetic. This is one reason to justify wrapping behavior for overflow handling. – Matthieu M. Jan 15 '19 at 09:08
  • @MatthieuM. The "convolution" only happens when you think of unsigned integers as numbers. The fact that adding a number from the equivalence class of 42 to a number from the equivalence class of -10 yields a value from the equivalence class of 32 is no less intuitive than the signed result IMO. – Baum mit Augen Jan 15 '19 at 12:31
  • So easy and simple to understand thank you. One thing: Why `short` and `char` are not promoted to `unsigned`? – Alex24 Jan 16 '19 at 18:14
  • 3
    @Alex24 You're welcome. `short` and `char` get promoted to `int` as that is the smallest built in type that the mathematical operators use. The only time where that could be different is when you have a `unsigned char` or `unsigned short` and it has the same size as `int`. Then it would be promoted to an `unsigned int`. – NathanOliver Jan 16 '19 at 18:34
  • @NathanOliver: Perfect! thank you again. Really got your point. – Alex24 Jan 16 '19 at 20:50
18

Well, I guess this is an exception to "two wrongs don't make a right" :)

What's happening is that there are actually two wrap arounds (unsigned overflows) under the hood and the final result ends up being mathematically correct.

  • First, i is converted to unsigned and as per the wrap around behavior the value is std::numeric_limits<unsigned>::max() - 9.

  • When this value is summed with u the mathematical result would be std::numeric_limits<unsigned>::max() - 9 + 42 == std::numeric_limits<unsigned>::max() + 33 which is an overflow and we get another wrap around. So the final result is 32.


As a general rule in an arithmetic expression if you only have unsigned overflows (no matter how many) and if the final mathematical result is representable in the expression data type, then the value of the expression will be the mathematically correct one. This is a consequence of the fact that unsigned integers in C++ obey the laws of arithmetic modulo 2n (see bellow).


Important notice. According to C++ unsigned arithmetic does not overflow:

§6.9.1 Fundamental types [basic.fundamental]

  1. Unsigned integers shall obey the laws of arithmetic modulo 2n where n is the number of bits in the value representation of that particular size of integer 49

49) This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.

I will however leave "overflow" in my answer to express values that cannot be represented in regular arithmetic.

Also what we colloquially call "wrap around" is in fact just the arithmetic modulo nature of the unsigned integers. I will however use "wrap around" also because it is easier to understand.

bolov
  • 72,283
  • 15
  • 145
  • 224
  • Unsigned arithmetic does not overflow. – Baum mit Augen Jan 14 '19 at 22:50
  • afaik it does and it is well defined – bolov Jan 14 '19 at 22:51
  • 3
    @BaummitAugen what? of course it overflows. Add two large unsigned ints, you can't represent a number that doesn't fit in the number format – Garr Godfrey Jan 14 '19 at 22:51
  • No. It can represent any integer by picking the representative of its equivalence class that lies in [0, 2^NumBit - 1]. Overflow occurs when some result cannot be represented. That cannot happen in unsigned arithmetic. – Baum mit Augen Jan 14 '19 at 22:52
  • @GarrGodfrey Yes it can represent the result. You are misunderstanding the arithmetic it implements. – Baum mit Augen Jan 14 '19 at 22:52
  • We are talking about C++. unsigned int is a fixed size. It cannot expand infinitely. – Garr Godfrey Jan 14 '19 at 22:55
  • @BaummitAugen unsigned numbers to over/underflow. They just do so in a well defined manner. – NathanOliver Jan 14 '19 at 22:58
  • 2
    @BaummitAugen I went to the standard to prove my point. Turns out I was wrong. Thank you. – bolov Jan 14 '19 at 23:00
  • @GarrGodfrey Every integer is in some equivalence class of Z / n for any natural number n. Finite bits suffice. – Baum mit Augen Jan 14 '19 at 23:00
  • "_if the final mathematical result_" result of WHAT operation? – curiousguy Jan 14 '19 at 23:03
  • @curiousguy any arithmetic expression. I edited to be more clear. Hope now it is. – bolov Jan 14 '19 at 23:05
  • 1
    it's a bit ridiculous to argue losing the high order bits doesn't count as overflow just because the low order bits are still accurate. – Garr Godfrey Jan 14 '19 at 23:08
  • 2
    @GarrGodfrey I searched the standard and he is indeed correct. There is no overflow. And this is because unsigned integers do not represent natural numbers, but, as BaummitAugen said, they represent equivalence classes. – bolov Jan 14 '19 at 23:10
  • @bolov "_they represent equivalence classes_" then why would anyone use them to represent a number of bytes, a number of objects in a container, etc.? – curiousguy Jan 15 '19 at 01:50
  • @curiousguy because they are useful like that – bolov Jan 15 '19 at 01:54
  • @bolov At least with signed integers the compiler is allowed to stop programs that overflow. Overflow is most of the time unintentional and the sign of a bug. (A bug that could easily be a security vulnerability.) Defining overflow has wrap around is sometimes useful but can hide bugs. If people rely on it being defined, you can't add a debugging mode later that diagnoses it because of the false alerts. – curiousguy Jan 15 '19 at 01:58
  • 2
    @curiousguy I've heard some ppl (experts) saying that having wraparound behavior for unsigned was a bad decision in hindsight. They main reason however was the optimization it inhibits. They argued for regular unsigned to have undefined behavior on overflow and to exist some other data type that had wraparound behavior. But it is what it is. – bolov Jan 15 '19 at 02:07
  • 2
    @curiousguy *"then why would anyone use them to represent a number of bytes, a number of objects in a container, etc.?"* Several prominent members of the committee consider exactly that a historical mistake. https://channel9.msdn.com/Events/GoingNative/2013/Interactive-Panel-Ask-Us-Anything 9:50, 42:40, 1:02:50 – Baum mit Augen Jan 15 '19 at 09:14
6

i is in fact promoted to unsigned int.

Unsigned integers in C and C++ implement arithmetic in ℤ / 2nℤ, where n is the number of bits in the unsigned integer type. Thus we get

[42] + [-10] ≡ [42] + [2n - 10] ≡ [2n + 32] ≡ [32],

with [x] denoting the equivalence class of x in ℤ / 2nℤ.

Of course, the intermediate step of picking only non-negative representatives of each equivalence class, while it formally occurs, is not necessary to explain the result; the immediate

[42] + [-10] ≡ [32]

would also be correct.

Baum mit Augen
  • 49,044
  • 25
  • 144
  • 182
3

"In the second expression, the int value -42 is converted to unsigned before the addition is done"

yes this is true

unsigned u = 42;
int i = -10;
std::cout << u + i << std::endl; // Why the result is 32?

Supposing we are in 32 bits (that change nothing in 64b, this is just to explain) this is computed as 42u + ((unsigned) -10) so 42u + 4294967286u and the result is 4294967328u truncated in 32 bits so 32. All was done in unsigned

bruno
  • 32,421
  • 7
  • 25
  • 37
2

This is part of what is wonderful about 2's complement representation. The processor doesn't know or care if a number is signed or unsigned, the operations are the same. In both cases, the calculation is correct. It's only how the binary number is interpreted after the fact, when printing, that is actually matters (there may be other cases, as with comparison operators)

-10 in 32BIT binary is FFFFFFF6
42 IN 32bit BINARY is  0000002A

Adding them together, it doesn't matter to the processor if they are signed or unsigned, the result is: 100000020. In 32bit, the 1 at the start will be placed in the overflow register, and in c++ is just disappears. You get 0x20 as the result, which is 32.

In the first case, it is basically the same:

-42 in 32BIT binary is FFFFFFD6
10 IN 32bit binary is 0000000A

Add those together and get FFFFFFE0

FFFFFFE0 as a signed int is -32 (decimal). The calculation is correct! But, because it is being PRINTED as an unsigned, it shows up as 4294967264. It's about interpreting the result.

Garr Godfrey
  • 8,257
  • 2
  • 25
  • 23
  • 1
    As a side note, on x86, most arithmetic operations set various bits in the flags register, which can then be used to e.g. perform branching. For example, addition would set CF (carry flag) to signify that unsigned "wrapping" has occurred, and/or OF (overflow flag) to signify that signed overflow has occurred, if you were to consider the operands signed. There was even a 1-byte `INTO` instruction which generated a software interrupt if OF was set, which could help with debugging, but unfortunately was not supported from higher level languages, and is no longer available in 64 bit mode. – Arne Vogel Jan 15 '19 at 10:45
  • 1
    Technically speaking, nothing mandates 2's-complement representation. It's just that the modular-arithmetic conversion specified by the standard is exactly equivalent to using 2's-complement (as you'll see if you perform the conversion on a sign-magnitude or 1's-complement system, if it's correct). – Toby Speight Jan 15 '19 at 11:35
  • implementing an adder in 1's-compliment or sign-magnitude is very complicated vs 2's compliment. Not required, except for sanity. – Garr Godfrey Jan 15 '19 at 18:41