unsigned int x = 4;
int y = -2; 
int z = x > y; 

When this comparison is evaluated, the value of the variable z is 0. Why is it 0 and not 1?

J.Doe
3 Answers


Believe it or not, if a C expression has two operands, one of int type and the other of unsigned type, then the int value is converted to the unsigned type before the comparison takes place.

So in your case, y is converted to unsigned int. Because it's negative, it is converted by having UINT_MAX + 1 added to it, so it takes the value UINT_MAX - 1.

Therefore x > y will be 0.

Yes, this is the cause of very many bugs. Even professional programmers fall for it from time to time. Some compilers can warn you: for example, with gcc, compiling with -Wextra (which enables -Wsign-compare) produces a warning.
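
As a quick demonstration, here is a complete version of the question's snippet (a minimal sketch; the output shown assumes a typical platform). Compiling it with gcc -Wextra produces the signed/unsigned comparison warning mentioned above:

#include <stdio.h>

int main(void)
{
    unsigned int x = 4;
    int y = -2;

    int z = x > y;      /* y is converted to unsigned int before comparing */
    printf("%d\n", z);  /* prints 0 */
    return 0;
}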

Bathsheba
  • Just to add: The fundamental reason for this behavior is that CPU instruction sets generally do not include instructions for mixed-sign comparisons. And without hardware support, it would be very tricky/slow to implement mixed-sign comparison correctly. Of course, it would be possible for the CPU designers to provide such instructions, but for some reason, they never cared to do so. At least they didn't before C became the de-facto standard for low-level programming. And now, since C defines comparison the way it does, mixed-sign comparison instructions would not be used if they were implemented... – cmaster - reinstate monica Nov 17 '17 at 13:51
  • @cmaster: A more fundamental reason is that different compilers did different things in the days before the Standard, and the authors of the Standard wanted to write a single rule which would come as close as possible to fitting them all. I don't think anyone designing a language from scratch would have written the rules as they are. – supercat Sep 07 '19 at 22:32
  • @supercat Well, from a language perspective, mixed-sign comparison would be the natural, single-rule choice: Just take both operands for what they are typed to be and compare their logical values. `(unsigned char)128 > (signed char)-1`. Simple, unsurprising, useful. The problem is that CPUs didn't offer such a comparison, and compilers cannot use what CPUs don't offer. C was designed to be fast on the CPUs of the time, so it didn't do the natural, simple, logical thing, but rather went for what CPUs actually offered, *and added rules specifically to avoid mixed-sign comparisons*. – cmaster - reinstate monica Sep 07 '19 at 22:46
  • @cmaster: The problem is that the language specifies the behavior of all dual-operand integer operators other than `<<` and `>>` as converting both operands to a common type, and yielding a result of that type or a result that is always of type `int`. CPU behavior has nothing to do with it. If the Standard had specified that `<` should compare the numerical operands regardless of types, without balancing promotions, that would generally not have been particularly hard for compilers to process reasonably efficiently. Code which didn't need to handle both cases where one operand was... – supercat Sep 07 '19 at 23:10
  • ...outside the range of the other could be made to compile more efficiently with the aid of a cast, but on a typical system with 16-bit `int`, generating efficient, logically correct machine code for `int1 > uint1` would be easier than generating equally-efficient machine code for `(long)int1 > (long)uint1`. – supercat Sep 07 '19 at 23:13
  • @supercat The fact that other operators do derive a common type in which the operation is carried out is indeed a good argument. I wouldn't say that I consider it sufficient, though: The comparison operators really are a different beast from the other operators, both mathematically (result is boolean where operands are numbers) and from a hardware perspective (results are available in special CPU flags instead of being placed in registers). And, as the case of `<<` and `>>` shows, operators can be defined to take non-matching operands. So, good, but slightly insufficient argument. – cmaster - reinstate monica Sep 07 '19 at 23:30
  • @cmaster: If CPU efficiency were the driving factor behind operator design, there should be pointer addition and subtraction operators that are measured in terms of bytes rather than target-sized units. Some less common languages such as PROMOL took that approach (I've not used it, but saw a talk about it) and it can greatly improve the efficiency of code on some platforms. For example, on the 68000 if `int` is 16 bits, given `char *cp; int i,*ip;`, if pointers are in address registers and `i` is in a data register, loading `cp[i]` or `(int*)((char*)ip+i)` would be one instruction... – supercat Sep 08 '19 at 17:57
  • ...but loading `ip[i]` would require three: sign-extend `i` to 32 bits, add it to itself, and then use that as a displacement to `ip`. The decision to make the operands balanced is purely a language-spec one, based on the pattern that the only operands which aren't balanced are those whose type is fixed or that are used to index a pointer. Interestingly, even `(int*)((char*)ip+(i+i))` could execute faster than `ip[i]`, since it could skip the sign-extension instruction. – supercat Sep 08 '19 at 18:08
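
As a follow-up to the comment thread above: whatever the hardware offers, a mathematically correct mixed-sign comparison is easy to express in portable C. A minimal sketch (the helper name is invented for illustration):

#include <stdbool.h>

/* Compare an unsigned int and an int by their mathematical values,
   sidestepping the usual arithmetic conversions. */
static bool unsigned_gt_signed(unsigned int u, int s)
{
    /* Every unsigned value exceeds every negative value; otherwise s is
       non-negative, so the cast below preserves its value. */
    return s < 0 || u > (unsigned int)s;
}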

This is a result of the usual arithmetic conversions.

In the expression x > y, you have one int operand and one unsigned int operand. y is converted to unsigned int, and its value is converted by adding one more than the maximum unsigned int value (i.e. UINT_MAX + 1) to it.

Section 6.3.1.8 of the C standard covering arithmetic conversions states the following:

Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:

...

Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

So now y has a very large value (UINT_MAX - 1) that is compared against x, which is 4. Since x is smaller than this value, x > y evaluates to false, which has the value 0.
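
To see the value y takes on, you can apply the same conversion explicitly; a short sketch (the exact number printed depends on the width of unsigned int, e.g. 4294967294 when it is 32 bits):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int y = -2;
    printf("%u\n", (unsigned int)y);  /* the converted value of y */
    printf("%u\n", UINT_MAX - 1);     /* same value: -2 + (UINT_MAX + 1) */
    return 0;
}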

dbush

This is a result of the usual arithmetic conversions.

int z = x > y; 

In this example, the comparison operator (>) operates on a signed int and an unsigned int. By the conversion rules, y is converted to unsigned int. Because -2 cannot be represented as an unsigned int value, it is converted to -2 + UINT_MAX + 1, which equals UINT_MAX - 1.

C11 6.3.1.3, paragraph 2:

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

So z is 0, because 4 is not greater than UINT_MAX - 1.

Alternatively, if you want z to be 1, use an explicit cast:

int z = (int)x > y;
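
Putting it together, a small complete program contrasting the two comparisons (a sketch using the question's variables):

#include <stdio.h>

int main(void)
{
    unsigned int x = 4;
    int y = -2;

    printf("%d\n", x > y);       /* 0: y is converted to UINT_MAX - 1 */
    printf("%d\n", (int)x > y);  /* 1: signed comparison, 4 > -2 */
    return 0;
}

Note that the cast is safe here only because the value of x (4) fits in an int; converting an out-of-range unsigned value to int gives an implementation-defined result.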
msc