
I have this simple C program.

#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>

bool foo (unsigned int a) {
    return (a > -2L);
}

bool bar (unsigned long a) {
    return (a > -2L);
}

int main() {
    printf("foo returned = %d\n", foo(99));
    printf("bar returned = %d\n", bar(99));
    return 0;
}

Output when I run this -

foo returned = 1
bar returned = 0

Recreated in godbolt here

My question is: why does foo(99) return true while bar(99) returns false?

To me it makes sense that bar would return false. For simplicity, let's say longs are 8 bits; then (using two's complement for signed values):

99 == 0110 0011
-2 == unsigned 254 == 1111 1110

So clearly the CMP instruction will see that 1111 1110 is bigger and return false.

But I don't understand what is going on behind the scenes in the foo function. The assembly for foo seems to hardcode the result: it always does mov eax,0x1. I would have expected foo to do something similar to bar. What is going on here?

Vlad from Moscow
Sourav Ganguly
  • Please pick *one* language to ask about. While C and C++ might seem similar they are really two very different languages with different behavior in many cases. – Some programmer dude Jan 09 '22 at 14:50
  • Remember that you compare to the **`long`** value `-2` which for `foo` means that the `unsigned int` variable `a` will be converted to a plain `long`. Arithmetic operations (and comparison is really an arithmetic operation) are done using the larger type. – Some programmer dude Jan 09 '22 at 14:53
  • @Someprogrammerdude: `unsigned int` is only converted to a `long` if `long` can represent all the values of `unsigned int`. This does appear to have happened in OP’s case. If `long` is the same width as `unsigned int`, then the last rule of the usual arithmetic conversions applies, which says that both operands are converted to the unsigned type corresponding to the type of the signed integer operand, which would be `unsigned long`. – Eric Postpischil Jan 09 '22 at 15:03
  • There are no negative literals – Language Lawyer Jan 09 '22 at 15:52

4 Answers


This is covered in C classes and is specified in the documentation. Here is how you use documents to figure this out.

In the 2018 C standard, you can look up > or “relational expressions” in the index to see they are discussed on pages 68-69. On page 68, you will find clause 6.5.8, which covers relational operators, including >. Reading it, paragraph 3 says:

If both of the operands have arithmetic type, the usual arithmetic conversions are performed.

“Usual arithmetic conversions” is listed in the index as defined on page 39. Page 39 has clause 6.3.1.8, “Usual arithmetic conversions.” This clause explains that operands of arithmetic types are converted to a common type, and it gives rules determining the common type. For two integer types of different signedness, such as the unsigned long and the long int in bar (a and -2L), it says that, if the unsigned type has rank greater than or equal to the rank of the other type, the signed type is converted to the unsigned type.

“Rank” is not in the index, but you can search the document to find it is discussed in clause 6.3.1.1, where it tells you the rank of long int is greater than the rank of int, and that any unsigned integer type has the same rank as the corresponding signed integer type.

Now you can consider a > -2L in bar, where a is unsigned long. Here we have an unsigned long compared with a long. They have the same rank, so -2L is converted to unsigned long. Conversion of a signed integer to unsigned is discussed in clause 6.3.1.3. It says the value is converted by wrapping it modulo ULONG_MAX+1, so converting the signed long −2 produces ULONG_MAX+1−2 = ULONG_MAX−1, which is a large integer. Then comparing a, which has the value 99, to that large integer with > yields false, so zero is returned.

For foo, we continue with the rules for the usual arithmetic conversions. When the unsigned type does not have rank greater than or equal to the rank of the signed type, but the signed type can represent all the values of the type of the operand with unsigned type, the operand with the unsigned type is converted to the type of the operand with signed type. In foo, a is unsigned int and -2L is long int. Presumably in your C implementation, long int is 64 bits, so it can represent all the values of a 32-bit unsigned int. So this rule applies, and a is converted to long int. This does not change the value. So the original value of a, 99, is compared to −2 with >, and this yields true, so one is returned.

Eric Postpischil

In the first function

bool foo (unsigned int a) {
    return (a > -2L);
}

both operands of the expression a > -2L have the type long (the first operand is converted to the type long by the usual arithmetic conversions, because the rank of the type long is greater than the rank of the type unsigned int and all values of the type unsigned int on the system in use can be represented by the type long). And it is evident that the positive value 99L is greater than the negative value -2L.

The first function could produce the result 0 provided that sizeof( long ) is equal to sizeof( unsigned int ). In this case the type long is unable to represent all (positive) values of the type unsigned int. As a result, due to the usual arithmetic conversions, both operands will be converted to the type unsigned long.

For example, running the function foo built with MS VS 2019, where sizeof( long ) is equal to 4, the same as sizeof( unsigned int ), you will get the result 0.

Here is a demonstration program written in C++ that visually shows the reason why the result of a call of the function foo using MS VS 2019 can be equal to 0.

#include <iostream>
#include <iomanip>
#include <type_traits>

int main()
{
    unsigned int x = 0;
    long y = 0;

    std::cout << "sizeof( unsigned int ) = " << sizeof( unsigned int ) << '\n';
    std::cout << "sizeof( long ) = " << sizeof(long) << '\n';

    std::cout << "std::is_same_v<decltype( x + y ), unsigned long> is "
              << std::boolalpha
              << std::is_same_v<decltype( x + y ), unsigned long>
              << '\n';
}

The program output is

sizeof( unsigned int ) = 4
sizeof( long ) = 4
std::is_same_v<decltype( x + y ), unsigned long> is true

That is, in general the result of the first function is implementation-defined.

In the second function

bool bar (unsigned long a) {
    return (a > -2L);
}

both operands have the type unsigned long (again due to the usual arithmetic conversions: the ranks of the types unsigned long and signed long are equal to each other, so an object of the type signed long is converted to the type unsigned long), and -2L interpreted as unsigned long is greater than 99.

Vlad from Moscow

The reason for this has to do with the rules of integer conversions.

In the first case, you compare an unsigned int with a long using the > operator, and in the second case you compare an unsigned long with a long.

These operands must first be converted to a common type using the usual arithmetic conversions. These are spelled out in section 6.3.1.8p1 of the C standard, with the following excerpt focusing on integer conversions:

If both operands have the same type, then no further conversion is needed.

Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.

Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

In the case of comparing an unsigned int with a long, the fourth paragraph above applies (the one beginning "Otherwise, if the type of the operand with signed integer type can represent all of the values"). long has higher rank and (assuming long is 64 bit and int is 32 bit) can hold all values that an unsigned int can, so the unsigned int operand a is converted to a long. Since the value in question is in the range of long, section 6.3.1.3p1 dictates how the conversion happens:

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged

So the value is preserved and we're left with 99 > -2 which is true.

In the case of comparing an unsigned long with a long, the third paragraph above applies (the one where the unsigned operand has rank greater than or equal to the rank of the other operand). Both types are of the same rank with different signs, so the long constant -2L is converted to unsigned long. -2 is outside the range of an unsigned long so a value conversion must happen. This conversion is specified in section 6.3.1.3p2:

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

So the long value -2 will be converted to the unsigned long value 2^64-2, assuming unsigned long is 64 bit. So we're left with 99 > 2^64-2, which is false.

dbush

I think what is happening here is implicit promotion by the compiler. When you perform comparison on two different primitives, the compiler will promote one of them to the same type as the other. I believe the rules are that the type with the larger possible value is used as the standard. So in foo() you are implicitly promoting your argument to a signed long type and the comparison works as expected. In bar() your argument is an unsigned long, which has a larger maximum value than signed long. Here the compiler promotes -2L to unsigned long, which turns into a very large number.

akatz
  • `the compiler promotes -2L to unsigned long` There are no promotions to `unsigned long`. It's an arithmetic conversion. – eerorika Jan 09 '22 at 14:57