Why is unsigned treated as signed?

Question

I know a very similar question has already been answered, but I believe it doesn't address my problem.

unsigned char aaa = -10;
unsigned int bbb = (unsigned int)-5;
unsigned int ccc = (unsigned int)20 + (unsigned int)bbb;

printf("%d\n", aaa);
printf("%d\n", ccc);

The code above prints aaa = 246 (which is what I would expect) but ccc = 15, which suggests the unsigned int was treated as signed all the way through. I can't find an explanation for this, even after trying the probably obsolete typecasts.

The `%d` specifier is used for signed integers - it's your responsibility as a programmer to provide the correct type for a specifier — UnholySheep, Apr 25 '20 at 19:40
You're getting unsigned integer overflow with `ccc`. Look up how binary addition and two's complement work. — Louis Cloete, Apr 25 '20 at 19:44
I believed that when you initialise unsigned bbb with -5, its value should be set to a very large positive number. That is obviously not happening, but for char aaa it works this way. Why? — Rafal, Apr 25 '20 at 19:45
@Rafal Yes, that's correct. On a 32-bit system, setting bbb to -5 actually stores `2^32 - 5` in bbb. And when you add 20, you get `2^32 + 15`. That number is too big for a 32-bit number. So the result is `(2^32 + 15) modulo 2^32` which is just 15. — user3386109, Apr 25 '20 at 19:48
[Signed to unsigned conversion in C](https://stackoverflow.com/questions/50605/signed-to-unsigned-conversion-in-c-is-it-always-safe) — KamilCuk, Apr 25 '20 at 19:48
Did you try printing `bbb` with the correct conversion specifier `%u`? — Blastfurnace, Apr 25 '20 at 19:48
Thanks, now I get it. There were two problems: the one mentioned by @user3386109, and the use of %d in printf. I double-checked the value of ccc using a for loop to count iterations instead of printf, and it makes sense now. — Rafal, Apr 25 '20 at 19:56

score 6 · Answer 1 · edited Apr 25 '20 at 20:11

unsigned char aaa = -10;

The int value -10 is converted to unsigned char by repeatedly adding UCHAR_MAX + 1 until the result is in the range [0, UCHAR_MAX]. Most probably char has 8 bits on your system, which means UCHAR_MAX = 2**8 - 1 = 255. So -10 + UCHAR_MAX + 1 is 246, which is in range. So aaa = 246.

unsigned int bbb = (unsigned int)-5;

UINT_MAX + 1 is added to -5; assuming int has 32 bits, this results in bbb = 4294967291.

unsigned int ccc = (unsigned int)20 + (unsigned int)bbb;

Unsigned integer overflow "wraps around". 20 + 4294967291 = 4294967311 is greater than UINT_MAX = 2**32 - 1 = 4294967295, so we subtract UINT_MAX + 1 until the result is in the range [0, UINT_MAX]. So 4294967311 - (UINT_MAX + 1) = 15.

printf("%d\n", aaa);

The code is most probably fine. On your platform, unsigned char is most probably promoted to int when it is passed as an argument to a variadic function. For reference about promotions, you could read cppreference implicit conversions. Because %d expects an int and unsigned char is promoted to int, the code is fine. [1]

printf("%d\n", ccc);

This line results in undefined behavior. ccc has the type unsigned int, while the %d printf format specifier expects a signed int. Because your platform uses two's complement to represent numbers, this just results in printf interpreting the bits as a signed value, which is 15 anyway.

[1]: There is a theoretical possibility of unsigned char having as many bits as int, in which case unsigned char would be promoted to unsigned int instead of int, which would result in undefined behavior there too.

Thank you for detailed explanation. — Rafal, Apr 25 '20 at 20:02

score 2 · Answer 2 · edited Apr 25 '20 at 21:26

C variable types

C is quite a low-level programming language, and it does not preserve variable types after compilation. For example, if one had both a signed and an unsigned variable of equal size (say int64_t and uint64_t), after compilation each of them would be represented as just an 8-byte piece of memory. All addition/subtraction is performed modulo 2 raised to the corresponding power (64 for 64-bit variables and 32 for 32-bit ones). The most visible difference that remains after compilation is in comparison. For example, (unsigned) -5 > 1, but -5 < 1. I will explain why now.

Your -5 will be stored modulo 2^32, just like every 32-bit value. -5 will be represented as 0xfffffffb in actual RAM. The algorithm is simple: if a variable is signed, then its first (most significant) bit is called the sign bit and indicates whether the value is positive or negative. The first bit of 0xfffffffb is 1, so in a signed comparison it is negative and is less than 1. But when compared as an unsigned integer, this value is actually a huge one, 2^32 - 5. So, in general, the unsigned representation of a negative signed number is greater by 2^[number of bits in that number]. You can read more about this binary arithmetic here. So all that happened is that you got an unsigned number equal to 0xfffffffb + 0x14 modulo 2^32, which is 0x10000000f (mod 2^32), and that is 0xf = 15.

In conclusion, "%d" assumes its argument is signed. But that's not the main reason why you got this result. However, I advise using %u for unsigned numbers.

edited Apr 25 '20 at 21:26 (user3386109) · answered Apr 25 '20 at 20:23 (MinuxLint)

I am sorry for supplying a very brief answer. I am a newbie here, and I got confused with the editor usage. I had to delete half of the explanation, as I wanted to include some ASCII art, which was weirdly processed by the Stack Overflow text engine. All in all, just google integer overflow – MinuxLint Apr 25 '20 at 20:29
To get ASCII art to format correctly, the easiest solution is to format it as a code block. After entering the text, select it, and click the code button `{}` at the top of the edit window. – user3386109 Apr 25 '20 at 21:22
"Your -5 will be stored modulo 2^32" --> not quite. `(unsigned int)-5` converts the `int -5` to an `unsigned`. The conversion here adds `(UINT_MAX+1)` with the resultant sum of 0xFFFFFFFB. That positive value is then saved into `bbb`. This would be the same if -5 was 2's complement, ones' complement or sign magnitude. What is saved in `bbb` has nothing to do with the signed `int` encoding. – chux - Reinstate Monica Apr 26 '20 at 01:26
@chux-ReinstateMonica, I'm not sure how what you typed is different from my answer. Every signed number is stored in memory modulo 2**32, and for negative ones this is EXACTLY the same as adding UINT_MAX+1. – MinuxLint Apr 26 '20 at 07:31
In OP's question, there is no negative number stored / no signed number stored. – chux - Reinstate Monica Apr 26 '20 at 13:57
