35

I have some doubts about type conversion. Could you explain what happens in an expression like this:

unsigned int u = 10; 
int a = -42; 
std::cout << u - a << std::endl;

I know the result will be 52 if I apply the rules for arithmetic on two operands of different types.

However, I wonder what happens once the compiler converts a to an unsigned value and creates a temporary object of unsigned type. What happens next? The expression should now be 10 - 4294967254.

Jan Schultke
Piero Borrelli
  • 6
    Step 1: You get a copy of the C++ or C Standard (latest drafts are free) and check it. Step 2: You decide that you'll never be able to remember the rules and avoid that kind of thing in the future. – gnasher729 Sep 01 '14 at 16:05

3 Answers

45

In simple terms, if you mix types of the same rank (in the sequence of int, long int, long long int), the unsigned type "wins" and the calculations are performed within that unsigned type. The result is of the same unsigned type.

If you mix types of different rank, the higher-ranked type "wins", if it can represent all values of lower-ranked type. The calculations are performed within that type. The result is of that type.

Finally, if the higher-ranked type cannot represent all values of lower-ranked type, then the unsigned version of the higher ranked type is used. The result is of that type.

In your case you mixed types of the same rank (int and unsigned int), which means that the whole expression is evaluated within the unsigned int type. The expression, as you correctly stated, is now 10 - 4294967254 (for a 32-bit int). Unsigned types obey the rules of modular arithmetic, with 2^32 (4294967296) as the modulus. If you carefully calculate the result (which can be expressed arithmetically as 10 - 4294967254 + 4294967296), it will turn out to be the expected 52.

AnT stands with Russia
  • Sorry, I got lost. When the expression becomes `unsigned int temporary = 10 - 4294967254`, I understand that part, but I can't see why the expression then becomes 10 - 4294967254 + 4294967296. Why do you add the modulus to the expression? – Piero Borrelli Sep 02 '14 at 13:37
  • @Piero Borrelli: One way to calculate the `modulo N` equivalent of a negative value `V` is to add `N` to it as many times as necessary (`V + N`, `V + 2N`, `V + 3N` and so on) until you hit the first non-negative value. In case of C++ additive operations a mathematically negative result needs the modulo value added only once to arrive at the proper unsigned result. – AnT stands with Russia Sep 02 '14 at 14:12
  • @Piero Borrelli: Of course, this is a purely arithmetic rule. The compiler does not have to do anything like that. It does not have to worry about it at all. If the negative values are represented through 2's complement, a simple reinterpretation of that representation as unsigned one immediately provides the correct result. – AnT stands with Russia Sep 02 '14 at 16:17
  • Can you define what you mean by "rank"? C++ doesn't use rank in that way, making this answer ambiguous at best, nonsensical at worst. – Adrian Nov 18 '19 at 15:59
  • @Adrian: Actually, it does. I'm referring to the concept of *integer conversion rank*, as it is used in the description of *usual arithmetic conversions*. The description in my answer is not the exact quote from the standard, since it is intended to be tailored to the specific case of `u - a` from the original question. – AnT stands with Russia Nov 18 '19 at 16:28
10
  1. Due to standard conversion rules, the signed type a is converted to an unsigned type prior to subtraction. That conversion happens according to [conv.integral] p3:

Otherwise, the result is the unique value of the destination type that is congruent to the source integer modulo 2^N, where N is the width of the destination type.

Algebraically, a becomes a very large positive number, and certainly larger than u.

  2. u - a is a nameless temporary object and will be of unsigned type. (You can verify this by writing auto t = u - a and inspecting the type of t in your debugger.) Mathematically, this will first be a negative number, but after implicit conversion to the unsigned type, a wraparound rule similar to the one above is invoked.

In short, the two conversion operations have equal and opposite effects and the result will be 52. In practice, the compiler might optimize out all these conversions.

Jan Schultke
Bathsheba
-4

Here is the disassembled code. It first stores -42 in its two's complement form (0xffffffd6) and then performs the sub operation. So the result is 10 + 42.

0x0000000000400835 <+8>:    movl   $0xa,-0xc(%rbp)
0x000000000040083c <+15>:   movl   $0xffffffd6,-0x8(%rbp)
0x0000000000400843 <+22>:   mov    -0x8(%rbp),%eax
0x0000000000400846 <+25>:   mov    -0xc(%rbp),%edx
0x0000000000400849 <+28>:   sub    %eax,%edx
0x000000000040084b <+30>:   mov    %edx,%eax
phuclv
  • 8
    In the general case, disassembled code cannot serve as a meaningful source for understanding language-level semantics. Code generation is a one-way function. It is not possible to "trace it back", i.e. to figure out what the compiler was actually trying to do by looking at the generated code. – AnT stands with Russia Sep 01 '14 at 16:23