C's comparison's wrong with hierarchy promotion

Question

#include <stdio.h>

int main()
{
    unsigned int x=1;
    char y=-1;
    if (x>y)
    {
        printf("x>y");
    }
    else if(x==y)
        printf("x=y");
    else
        printf("x<y");
    return 0;
}

When I run the code above, it executes the last else's printf, which is really puzzling, because x is 1 and y is -1.

I think it has something to do with type promotion in the comparison 'x>y', because when I change x's type from 'unsigned int' to 'int', it works as expected. This problem is really interesting. Any answer, thought, or suggestion is welcome.

You would've got warnings when compiling that program. Pay attention to the warnings. — Spikatrix, Jun 19 '15 at 12:57
Sorry, but with Xcode and Ubuntu's gcc I got no warning message. Exactly what warning message are you talking about? — Beta, Jun 19 '15 at 13:03

score 6 · Accepted Answer · answered Jun 19 '15 at 13:06

It is actually correct, according to the standard.

Firstly, it is implementation defined whether char is signed or unsigned.

If char is unsigned, the initialisation will use modulo arithmetic, so initialising to -1 will initialise to the maximum value of an unsigned char - which is guaranteed to be greater than 1. The comparison will convert that char to unsigned (which doesn't change the value) before doing the comparison.

If char is signed, the comparison will convert the char with value -1 to be of type unsigned (since x is of type unsigned). That conversion, again, uses modulo arithmetic, except with respect to the unsigned type (so the -1 will convert to the maximum value an unsigned can represent). That results in a value that exceeds 1.

In practice, turning up the warning level on your compiler will trigger warnings on this sort of thing. That is a good idea, since the code arguably behaves in a manner that is less than intuitive.

I have one more thing to ask you. When I put this, "printf("%d\n",(unsigned int)y);" it prints out -1. Is there any reason the compiler does not print out the maximum value of unsigned int? — Beta, Jun 19 '15 at 13:16
`%d` will print a signed `int`, you should use `%u` to print an `unsigned int`. — mch, Jun 19 '15 at 13:20
@beta: technically, the behavior is undefined since the type of the argument (`unsigned int`) doesn't match what the conversion specifier expects (`int`). `%d` causes the output to be formatted as a signed decimal integer, and it will interpret its corresponding argument as a signed integer. — John Bode, Jun 19 '15 at 13:41

score 2 · Answer 2 · answered Jun 19 '15 at 13:09

For the comparison, y is converted from type char to type unsigned int (after first being promoted to int). However, an unsigned type cannot represent a negative value; instead, that -1 is converted to UINT_MAX, which is most definitely not less than 1.

John Bode


Eric Tsui · Answer 3 · 2015-06-19T13:09:18.817

This is called type promotion.

The rules, then (which you can also find on page 44 of K&R2, or in section 6.2.1 of the newer ANSI/ISO C Standard) are approximately as follows:

1, First, in most circumstances, values of type char and short int are converted to int right off the bat.

2, If an operation involves two operands, and one of them is of type long double, the other one is converted to long double.

3, If an operation involves two operands, and one of them is of type double, the other one is converted to double.

4, If an operation involves two operands, and one of them is of type float, the other one is converted to float.

5, If an operation involves two operands, and one of them is of type long int, the other one is converted to long int.

6, If an operation involves both signed and unsigned integers, the situation is a bit more complicated. If the unsigned operand is smaller (perhaps we're operating on unsigned int and long int), such that the larger, signed type could represent all values of the smaller, unsigned type, then the unsigned value is converted to the larger, signed type, and the result has the larger, signed type. Otherwise (that is, if the signed type can not represent all values of the unsigned type), both values are converted to a common unsigned type, and the result has that unsigned type.

7, Finally, when a value is assigned to a variable using the assignment operator, it is automatically converted to the type of the variable if (a) both the value and the variable have arithmetic type (that is, integer or floating point), or (b) both the value and the variable are pointers, and one or the other of them is of type void *.

score 1 · Answer 4 · answered Jun 19 '15 at 13:13

According to the C Standard (6.5.8 Relational operators)

3 If both of the operands have arithmetic type, the usual arithmetic conversions are performed.

And further (6.3.1.1 Boolean, characters, and integers, #2)

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.

And at last (6.3.1.8 Usual arithmetic conversions)

Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands

...

Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

Thus in the expression

x > y

the character y is promoted to type int. As unsigned int (the type of x) and int have the same rank, y is then converted to unsigned int according to the last quote. All its bits are set, and it corresponds to the maximum value that can be stored in type unsigned int. Thus you have

UINT_MAX > 1
^^^^^^^^  ^^^
   y       x

score 0 · Answer 5 · answered Jun 19 '15 at 13:03

Just run this:

#include <stdio.h>

int main(void)
{
    unsigned int x=1;
    char y=-1;
    printf("x : %#010x\n", x);
    /* %x expects an unsigned int, so convert y explicitly */
    printf("y : %#010x\n", (unsigned int)y);
    return 0;
}

which will output the hexadecimal values of your variables (here with a signed char and a 32-bit int):

x : 0x00000001
y : 0xffffffff

Do I need to go any further...?

score 0 · Answer 6 · answered Jun 19 '15 at 13:05

The problem is comparing a signed type with an unsigned type. Signed variables such as char y are generally stored in two's complement, where the top bit acts as the sign. Thus, char y = -1; gives you a y with a general representation of:

 vvvvvvv value  : 1111111
11111111
^ sign : negative

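Two's complement: to recover the magnitude, invert all bits and add one: (0000000 + 1) = 1

This means your comparison effectively does if (binary 1 > binary 11111111), with y's one bits extended across the full width of unsigned int.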

score 0 · Answer 7 · edited May 23 '17 at 11:51

The C++ language compiler tries to promote the types if there is no exact fit (as in our example: obviously a char is not an unsigned int). Unfortunately, the direction of such a promotion goes from a less precise type to a more precise one, not vice versa. This means that any char can be promoted to int, but no int can be promoted to char.

Where no suitable conversion exists, expect the compiler to report that it is unable to find the best candidate, and the compilation will fail.

Negative numbers are represented using two's complement. To represent a char with the value -1, take the bit pattern of 1, invert all bits, and add one.

When the promotion occurs, the char is implicitly widened to 4 bytes. It is the same procedure as expressing -1 in a 4-byte int (only the low 16 bits are shown here):

0000 0000 0000 0001 
1111 1111 1111 1110 
+                 1 
1111 1111 1111 1111

This value is now treated as unsigned, as in the code below:

#include <iostream>

int main(){
    int y = -1;
    std::cout << y << std::endl;             // if x is an int: x > y
    std::cout << unsigned(y) << std::endl;   // if x is an unsigned int, y is now treated as UINT_MAX: x < y
    return 0;
}

which prints:

-1
4294967295

Thus, the evaluation of x < y will be true.
