10

I thought that unsigned int could store only integers >= 0. But when I tried assigning a negative value to an unsigned int, nothing special happened. It seemed to store the value with no problem.

So what is the difference between signed and unsigned int, and what's the point if it can store any value anyway?

Danny
  • 2
    Tip: By default, most C compilers are very reticent with providing unsolicited warnings. Ask it to speak up, and you will be astounded. – Deduplicator Dec 22 '18 at 22:57
  • 5
    "_stored the value with no problem_": there _was_ a problem. – DYZ Dec 22 '18 at 22:58
  • Duplicate answer???: https://stackoverflow.com/questions/5169692/assigning-negative-numbers-to-an-unsigned-int – Hayden Dec 22 '18 at 22:59
  • Please share an [mcve] of your code to show us what you're doing. – Some programmer dude Dec 22 '18 at 23:01
  • 1
    Try this: `int main(){ unsigned int a = -1; if( 2U < a ){ printf("2 < '-1'\n"); }}` – datenwolf Dec 22 '18 at 23:04
  • 1
    If you present the code by which you purport to show that an unsigned integer stores a negative number, we will point out to you where it relies on undefined behavior, or else simply does not show what you claim. – John Bollinger Dec 22 '18 at 23:42
  • There are lots of impacts of the differences such as [integral promotions](https://stackoverflow.com/q/24371868/1708801) and [signed overflow causing undefined behavior](https://stackoverflow.com/a/24297811/1708801) – Shafik Yaghmour Dec 23 '18 at 00:05
  • Depends on what you call a problem. A negative value is, on storing into an unsigned variable, converted to a positive in a defined manner (wrapping, or modulo arithmetic) so `-1` will become the maximum value the unsigned type can represent (a pretty large value). If a negative value is stored in an unsigned variable, the result will not be negative. – Peter Dec 23 '18 at 00:21

5 Answers

9

A statement like

unsigned int t = -1;
printf("%u", t);

is completely legal and well defined in C. Negative values, when assigned to an unsigned integral type, get implicitly converted (cf. for example, this online C standard draft):

6.3.1.3 Signed and unsigned integers

(2) Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

The output of the above program is an unsigned value; with a 32-bit unsigned int, that is

4294967295

So you can assign "negative" values to unsigned integral types, yet the result is not a negative value in its actual sense. This is particularly relevant when you compare unsigned integral values to negative values. Consider, for example, the following two loops:

int i = 10;
while (--i >= 0) {  // 10 iterations
    printf("i: %d\n", i);
}

unsigned int u = 10;
while (--u >= 0) {  // endless loop; warning provided.
    printf("u: %u\n", u);
}

The first will finish after 10 iterations, whereas the second will never end: Unsigned integral values cannot become negative, so u >= 0 is always true.

Stephan Lechner
3

The point of using unsigned int in C is that:

  • It gives you a larger maximum value (the standard guarantees at least 32,767 for int vs at least 65,535 for unsigned int)
  • It gives you the ability to use the number for masking and avoid undefined behavior while bit-shifting the number
  • It lets the compiler do the checking for you that you are not assigning improper values to the number (if you know it's supposed to be unsigned) which is what would have happened in your case had you compiled with warnings turned on.
mnistic
  • 3
    @Fureeish, 2s complement is a convention for representing *signed* numbers. By definition, then, representations of unsigned types *do not* use it. And per C, representations of unsigned numbers do not have a sign bit. On the other hand, that does not give a larger range of numbers, or more numbers, but rather a *different* range of numbers, and more positive ones. – John Bollinger Dec 22 '18 at 23:40
  • 1
    Only answer that mentioned undefined bit-shift behavior in signed integers. – fdk1342 Dec 22 '18 at 23:47
  • @JohnBollinger Fine point on ["and more positive ones"](https://stackoverflow.com/questions/53899902/whats-the-point-using-unsigned-int-in-c#comment94645757_53899987). The positive range of an unsigned type like `unsigned long` is allowed to be the same. Example: `ULONG_MAX == LONG_MAX`, even though `ULONG_MAX/2 == LONG_MAX` is [far far](https://en.wikipedia.org/wiki/Star_Wars_opening_crawl) more common. I have not seen a platform employ this archaic feature for years, as it implies a padded unsigned type, and certainly never will see again. – chux - Reinstate Monica Dec 23 '18 at 05:42
  • That's the first time I've heard of such an implementation, @chux. Interesting. I see that the standard allows it, but only if the range of the signed type is larger than the minimum required range for that type. But as it relates to the question at hand, even an unsigned type whose maximum representable value is not greater than that of the corresponding signed type enjoys some guarantees about its behavior that the corresponding signed type does not have (as I know you are aware). – John Bollinger Dec 23 '18 at 14:17
  • 1
    @JohnBollinger The singular occurrence I experienced was a wider than 32-bit type that was effectively some signed `intN_t` and a (N-1)-bit unsigned as simply the "sign" bit was a padding bit. It had to do with the processor natively supporting `*,/` of the signed, but not unsigned. Today, such a model would find too much push-back from the user community and given we do not see it today implies the Darwin pressure relegated that to the computer graveyard. So even though allowed, not a practical concern, like a unicorn [26-bit float](https://stackoverflow.com/a/51883952/2410359) – chux - Reinstate Monica Dec 23 '18 at 14:35
3

You are correct that unsigned int can only store integers >= 0. (Of course there is an upper limit too, and that upper limit depends on your architecture and is defined as UINT_MAX in limits.h).

By assigning a signed int value to an unsigned int, you are invoking an implicit type conversion. The C language has some very precise rules about how this happens. Whenever possible, the compiler preserves the value. Take this for instance:

int x = 5;
unsigned int y;

y = x;

The above code also does a type conversion, but since the value "5" is representable in both signed and unsigned integer ranges, the value can be preserved, so y will also have a value of 5.

Now consider:

x = -5;
y = x;

In this case you are assigning a value that is not within the representable range of unsigned int, and therefore the compiler must convert the value to something within range. The C standard dictates that the value 1 + UINT_MAX will be added to the value until it is within range of unsigned int. On most systems these days, UINT_MAX is defined as 4294967295 (2^32 - 1), so the value of y will actually be 4294967291 (or 0xFFFFFFFB in hex).

It is important to note that on twos-complement machines (nearly ubiquitous these days) the binary representation of a signed int value of -5 is also 0xFFFFFFFB, but that is not required. The C standard allows for and supports machines that utilize different integer encodings, therefore portable code should never assume that the binary representation will be preserved after an implicit conversion such as this.

Hope this helps!

Joe Hickey
3

One important point is that overflowing a signed integer is undefined behaviour, whereas unsigned integers are defined to wrap around. In fact that is what is happening when you assign a negative value to one: it simply wraps around until the value is in range.

While this wrap-around behaviour of unsigned types means it is indeed perfectly valid to assign negative values to them, converting them back to signed types is not as well-defined (at best it is implementation-defined, at worst undefined behaviour, depending on how you do it). And while it may even be true that on many common platforms the signed and unsigned integers are internally the same, the intended meaning of the value matters for comparisons, conversions (such as to floating point), as well as for compiler optimisation.

In summary, you should use an unsigned type when you need well-defined wrap-around semantics for over- and underflow, and/or you need to represent positive integers greater than the maximum of the corresponding (or largest suitable) signed type. Technically you could avoid signed types in most cases by implementing negative numbers on top of unsigned types (after all, you could simply choose to interpret certain bit patterns as negative numbers), but… why, when the language offers this service "for free". The only real problem with signed integers in C is having to watch out for overflow, but in return you may get better optimisation.

Arkku
1

Unsigneds have 1) higher maximums and 2) defined, wraparound overflow.

If with infinite precision

 (unsigned_c = unsigned_a + unsigned_b) >= UINT_MAX

then unsigned_c will get reduced modulo UINT_MAX+1:

#include <limits.h>
#include <stdio.h>
int main()
{
    printf("%u\n", UINT_MAX+1); //prints 0
    printf("%u\n", UINT_MAX+2); //prints 1
    printf("%u\n", UINT_MAX+3); //prints 2
}

A similar thing happens when you store signed values into an unsigned type. In that case 6.3.1.3p2 applies: UINT_MAX+1 is conceptually added to the value.

With signed types, on the other hand, overflow is undefined, which means if you allow it to happen, your program is no longer well formed and the standard makes no guarantees about its behavior. Compilers exploit this for optimization by assuming it will never happen.

For example, if you compile

#include <limits.h>
#include <stdio.h>

__attribute__((noinline,noclone)) //or skip the attr & define it in another tu
_Bool a_plus1_gt_b(int a, int b) { return a + 1 > b; }

int main()
{
    printf("%d\n", a_plus1_gt_b(INT_MAX,0)); //wrapping arithmetic would give 0
    printf("%d\n", INT_MAX+1); //undefined: signed overflow
}

on gcc with -O3, it'll very likely print

1
-2147483648
Petr Skocik