
I have below a simple program:

#include <stdio.h>

#define INT32_MIN        (-0x80000000)

int main(void) 
{
    long long bal = 0;

    if(bal < INT32_MIN )
    {
        printf("Failed!!!");
    }
    else
    {
        printf("Success!!!");
    }
    return 0;
}

The condition if(bal < INT32_MIN ) is always true. How is it possible?

It works fine if I change the macro to:

#define INT32_MIN        (-2147483648L)

Can anyone point out the issue?

Peter Cordes
Jayesh Bhoi
  • How much is `CHAR_BIT * sizeof(int)`? – 5gon12eder Dec 09 '15 at 15:37
  • Have you tried printing out bal? – Ryan Fitzpatrick Dec 09 '15 at 15:39
  • IMHO the more interesting thing is that it is true *only* for `-0x80000000`, but false for `-0x80000000L`, `-2147483648` and `-2147483648L` (gcc 4.1.2), so the question is: why is the int literal `-0x80000000` different from the int literal `-2147483648`? – Andreas Fester Dec 09 '15 at 15:41
  • @Bathsheba I was just running the program on an online compiler: http://www.tutorialspoint.com/codingground.htm – Jayesh Bhoi Dec 09 '15 at 15:42
  • If you've ever noticed that (some incarnations of) `<limits.h>` defines `INT_MIN` as `(-2147483647 - 1)`, now you know why. – zwol Dec 09 '15 at 18:51
  • similar: [Casting minimum 32-bit integer (-2147483648) to float gives positive number (2147483648.0)](http://stackoverflow.com/q/11536389/995714), [Why it is different between -2147483648 and (int)-2147483648](http://stackoverflow.com/q/12620753/995714), [large negative integer literals](http://stackoverflow.com/q/8511598/995714) – phuclv Dec 10 '15 at 04:58
  • Modern compilers warn about `-0x80000000` – M.M Dec 10 '15 at 05:43
  • @LưuVĩnhPhúc On a standard 32 bit system that supposed duplicate will print "success" just as expected. As opposed to this question which will print "failed", which was not expected. The difference is that the post you linked uses a signed long long literal, instead of an unsigned int literal. – Lundin Dec 11 '15 at 07:26
  • @LưuVĩnhPhúc I think C++ is actually far behind C here, since C had a long long type which guaranteed at least 64 bits back in 1999, but C++ doesn't seem to have gotten one until 2011. Meaning that C and C++ would have displayed different results until C++11. – Lundin Dec 11 '15 at 07:38
  • @Lundin The *first* edition of the C++ standard (C++1998) was published only one year before the *second* edition of the C standard (C1999). I guess WG21 saw no particular hurry about putting out a second edition of C++, despite the significant revisions in C99 that it would have been nice to have uplifted. Looking back from 2015, that was probably a mistake. – zwol Dec 11 '15 at 15:32
  • [Why is 1 not greater than -0x80000000](http://stackoverflow.com/q/27052602/995714) – phuclv Dec 12 '15 at 09:39

6 Answers


This is quite subtle.

Every integer literal in your program has a type. Which type it has is regulated by a table in 6.4.4.1:

Suffix      Decimal Constant    Octal or Hexadecimal Constant

none        int                 int
            long int            unsigned int
            long long int       long int
                                unsigned long int
                                long long int
                                unsigned long long int

If a literal number can't fit inside the default int type, it will attempt the next larger type as indicated in the above table. So for regular decimal integer literals it goes like:

  • Try int
  • If it can't fit, try long
  • If it can't fit, try long long.

Hex literals behave differently though! If the literal can't fit inside a signed type like int, it will first try unsigned int before moving on to trying larger types. See the difference in the above table.

So on a 32 bit system, your literal 0x80000000 is of type unsigned int.

This means that you can apply the unary - operator on the literal without invoking undefined behavior, as you otherwise would when overflowing a signed integer. Instead, you will get the value 0x80000000, a positive value.

bal < INT32_MIN invokes the usual arithmetic conversions: the unsigned int operand 0x80000000 is converted to long long. The value 0x80000000 is preserved, and 0 is less than 0x80000000, hence the result.

When you replace the literal with 2147483648L you use decimal notation, so the compiler doesn't pick unsigned int but instead tries to fit it inside a long. The L suffix also says that you want a long if possible, and the suffix follows similar rules in the table in 6.4.4.1: if the number doesn't fit inside the requested long (which it doesn't in the 32-bit case), the compiler gives you a long long, where it fits just fine.

2501
Lundin
  • "... replace the literal with -2147483648L you explicitly get a long, which is signed." Hmmm, In a 32-bit `long` system `2147483648L`, will not fit in a `long`, so it becomes `long long`, _then_ the `-` is applied - or so I thought. – chux - Reinstate Monica Dec 09 '15 at 16:05
  • why can't `0x80000000` fit into an `int` on a 32 bit system? – A.S.H Dec 09 '15 at 16:07
  • @A.S.H Because the maximum number an int can have is then `0x7FFFFFFF`. Try it yourself: `#include <limits.h>` and `printf("%X\n", INT_MAX);` – Lundin Dec 09 '15 at 16:15
  • I know, it is the maximum *positive* number. the question is, when you specify a number in hex, does it *need* to be positive? – A.S.H Dec 09 '15 at 16:17
  • @A.S.H Don't confuse hexadecimal representation of integer literals in source code with the underlying binary representation of a signed number. The literal `0x7FFFFFFF` when written in source code is always a positive number, but your `int` variable can of course contain raw binary numbers up to value 0xFFFFFFFF. – Lundin Dec 09 '15 at 16:22
  • Sorry I am still confused. `int n = 0xFFFFFFFF; cout << n;` displays `-1`. Also `int n = 0x80000000; cout << n;` displays `-2147483648`. I question the statement *"can't fit inside a signed type like int"*. It probably needs further digging or be stated differently. – A.S.H Dec 09 '15 at 16:32
  • @A.S.H `int n = 0x80000000` forces a conversion from the unsigned literal to a signed type. What will happen is up to your compiler - it is implementation-defined behavior. In this case it chose to shove the whole literal into the `int`, overwriting the sign bit. On other systems it might not be possible to represent the type and you invoke undefined behavior - the program might crash. You'll get the very same behaviour if you do `int n=2147483648;` so it is not related to the hex notation at all. – Lundin Dec 09 '15 at 16:52
  • That's also why you'll find code like this in the [standard C headers](http://repo.or.cz/glibc.git/blob/HEAD:/include/limits.h): `#define INT_MIN (-INT_MAX - 1)` – nwellnhof Dec 09 '15 at 21:09
  • The behavior of "wrap around" unsigned numbers is fixed by the C++ standard. It has nothing to do with 2's complement (and only to do with the `sizeof(unsigned)`). Are you sure it is different in C? – Yakk - Adam Nevraumont Dec 09 '15 at 23:21
  • "Instead, on a two's complement system," - actually the system of representing negative numbers does not affect unsigned arithmetic, which is defined in terms of modular arithmetic. In the case of 32-bit int, `-0x80000000` is always `0x80000000`. – M.M Dec 10 '15 at 05:26
  • @Lundin out-of-range assignment from integer type to signed integer type is always *implementation-defined*; there are no UB cases – M.M Dec 10 '15 at 05:44
  • @M.M I believe the standard says something about an "implementation-defined signal may be raised". What that signal is or what happens if it isn't handled, is not covered by the standard. But sure, I can edit that part. – Lundin Dec 10 '15 at 07:15
  • The behaviour of signals is part of the Standard; the default handling of each signal is *implementation-defined* too (7.14/4) – M.M Dec 10 '15 at 07:18
  • I am surprised that this overly complicated explanation is so popular. It turns out that the comparison (<) has nothing to do with it really, your last two paragraphs seem to be completely irrelevant. Just try to output the value INT32_MIN to see how it is represented. – Octopus Dec 10 '15 at 19:56
  • @Octopus The paragraph about implicit promotion is relevant: suppose long is 32 bits, and we have a nearly identical example where the other operand is a long with any randomly picked value. Then the usual arithmetic conversions would instead have forced that operand to convert to unsigned, and the expression would have been evaluated in a completely different manner. As for the last paragraph, it answers the question. – Lundin Dec 11 '15 at 08:01
  • The explanation of how unary `-` is applied to unsigned integers could be expanded a bit. I had always assumed (though fortunately never relied on the assumption) that unsigned values would be "promoted" to signed values, or possibly that the result would be undefined. (Honestly, it should be a compile-error; what does `- 3u` even mean?) – Kyle Strand Dec 11 '15 at 20:04

0x80000000 is an unsigned literal with value 2147483648.

Applying the unary minus on this still gives you an unsigned type with a non-zero value. (In fact, for a non-zero value x, the value you end up with is UINT_MAX - x + 1.)

Bathsheba
  • In this case, `-0x80000000` is `0x80000000`, unsigned, since UINT_MAX+1 is `0xFFFFFFFF+1` = `1ULL<<32`. (Or actually `0` since UINT_MAX+1 wraps to 0 if you evaluated that expression according to C rules after re-arranging to `UINT_MAX+1 - x`, since addition is associative when signed-overflow UB isn't a factor.) Fun fact: signed `-INT_MIN` causes signed-overflow UB, unlike any other `int` value. The most-negative number is its own complement in 2's complement systems. – Peter Cordes Sep 01 '23 at 07:25

This integer literal 0x80000000 has type unsigned int.

According to the C Standard (6.4.4.1 Integer constants)

5 The type of an integer constant is the first of the corresponding list in which its value can be represented.

And this integer constant can be represented by the type of unsigned int.

So this expression

-0x80000000 has the same unsigned int type. Moreover, it also has the value 0x80000000, because unsigned negation wraps modulo UINT_MAX + 1; in two's complement terms it is calculated the following way

-0x80000000 = ~0x80000000 + 1 => 0x7FFFFFFF + 1 => 0x80000000

This has a side effect if you write, for example

int x = INT_MIN;
x = abs( x );

On a typical two's complement machine the result is again INT_MIN (strictly speaking, abs(INT_MIN) overflows, so the behavior is undefined).

Thus in this condition

bal < INT32_MIN

0 is compared with the unsigned value 0x80000000 converted to type long long int according to the rules of the usual arithmetic conversions.

It is evident that 0 is less than 0x80000000.

Vlad from Moscow

A point of confusion occurs in thinking the - is part of the numeric constant.

In the below code, 0x80000000 is the numeric constant. Its type is determined by that alone. The - is applied afterward and does not change the type.

#define INT32_MIN        (-0x80000000)
long long bal = 0;
if (bal < INT32_MIN )

Raw unadorned numeric constants are positive.

If it is decimal, the type assigned is the first type that will hold it: int, long, long long.

If the constant is octal or hexadecimal, it gets the first type that holds it: int, unsigned, long, unsigned long, long long, unsigned long long.

0x80000000, on OP's system gets the type of unsigned or unsigned long. Either way, it is some unsigned type.

-0x80000000 is then also a non-zero value of some unsigned type, so it is greater than 0. When code compares that to a long long, the values on the two sides of the comparison are unchanged, so 0 < INT32_MIN is true.


An alternate definition avoids this curious behavior

#define INT32_MIN        (-2147483647 - 1)

Let us walk in fantasy land for a while where int and unsigned are 48-bit.

Then 0x80000000 fits in int and so is the type int. -0x80000000 is then a negative number and the result of the print out is different.

[Back to the real world]

Since 0x80000000 is just larger than some_signed_MAX yet within some_unsigned_MAX, it fits in an unsigned type before any signed one, so it gets some unsigned type.

chux - Reinstate Monica

The numeric constant 0x80000000 is of type unsigned int. If we take -0x80000000 and do two's complement math on it, we get this:

~0x80000000 = 0x7FFFFFFF
0x7FFFFFFF + 1 = 0x80000000

So -0x80000000 == 0x80000000. And comparing (0 < 0x80000000) (since 0x80000000 is unsigned) is true.

dbush
  • This supposes 32-bit `int`s. Although that's a very common choice, in any given implementation `int` might be either narrower or wider. It is a correct analysis for that case, however. – John Bollinger Dec 09 '15 at 15:53
  • This isn't relevant to OP's code, `-0x80000000` is unsigned arithmetic. `~0x80000000` is different code. – M.M Dec 10 '15 at 05:51
  • This seems to be the best and correct answer to me simply put. @M.M. he is explaining how to take a twos complement. This answer specifically addresses what the negative sign is doing to the number. – Octopus Dec 10 '15 at 20:06
  • @Octopus the negative sign is *not* applying 2's complement to the number (!) Although this seems clear, it's not describing what happens in the code `-0x80000000` ! In fact 2's complement is irrelevant to this question entirely. – M.M Dec 10 '15 at 20:12

C has a rule that an integer literal may be signed or unsigned, depending on whether its value fits in a signed or unsigned type. On a 32-bit machine the literal 0x80000000 will be unsigned, and the two's complement negation -0x80000000 is again 0x80000000 on a 32-bit machine. Therefore, the comparison bal < INT32_MIN is between signed and unsigned operands, and before the comparison, per the C rule below, the unsigned int is converted to long long.

C11: 6.3.1.8/1:

[...] Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

Therefore, bal < INT32_MIN is always true.

haccks