
-2147483648 is the smallest integer for a 32-bit integer type, but it seems to overflow in the if (...) statement:

if (-2147483648 > 0)
    std::cout << "true";
else
    std::cout << "false";

This prints true in my testing. However, if we cast -2147483648 to int, the result is different:

if (int(-2147483648) > 0)
    std::cout << "true";
else
    std::cout << "false";

This will print false.

I'm confused. Can anyone give an explanation on this?


Update 02-05-2012:

Thanks for your comments. In my compiler, the size of int is 4 bytes. I'm using VC for some simple testing. I've changed the description in my question.

There are a lot of very good replies in this post. AndreyT gave a very detailed explanation of how the compiler behaves on such input and how this minimum integer is implemented. qPCR4vir, on the other hand, gave some related "curiosities" and showed how integers are represented. So impressive!

phuclv
benyl
  • _"we all know that -2147483648 is the smallest number of integer"_ That depends on the size of the integer. – orlp Feb 04 '13 at 20:35
  • "we all know that -2147483648 is the smallest number of integer" - I thought that there was no smallest integer, since there are infinitely many of them... Whatever. –  Feb 04 '13 at 20:35
  • @Inisheer With 4-byte integers you may have an `INT_MIN` of `-9223372036854775808` if `CHAR_BIT` is 16. And even with `CHAR_BIT == 8` and `sizeof(int) == 4` you may get `-2147483647`, because C does not require two's complement numbers. – 12431234123412341234123 Jul 19 '17 at 12:28

4 Answers


-2147483648 is not a "number". The C++ language does not support negative literal values.

-2147483648 is actually an expression: a positive literal value 2147483648 with the unary - operator in front of it. The value 2147483648 is apparently too large for the positive side of the int range on your platform. If type long int had a greater range on your platform, the compiler would have to automatically assume that 2147483648 has type long int. (In C++11 the compiler would also have to consider type long long int.) This would make the compiler evaluate -2147483648 in the domain of the larger type, and the result would be negative, as one would expect.

However, apparently in your case the range of long int is the same as the range of int, and in general there's no integer type with a greater range than int on your platform. This formally means that the positive constant 2147483648 overflows all available signed integer types, which in turn means that the behavior of your program is undefined. (It is a bit strange that the language specification opts for undefined behavior in such cases, instead of requiring a diagnostic message, but that's the way it is.)

In practice, taking into account that the behavior is undefined, 2147483648 might get interpreted as some implementation-dependent negative value which happens to turn positive after having unary - applied to it. Alternatively, some implementations might decide to attempt using unsigned types to represent the value (for example, in C89/90 compilers were required to use unsigned long int, but not in C99 or C++). Implementations are allowed to do anything, since the behavior is undefined anyway.
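The selection rule described above can be sketched as a small model. This is only an illustration, assuming the C++11 rule for unsuffixed decimal literals (try int, then long int, then long long int); the function name `decimal_literal_type` is hypothetical and not part of any library.

```cpp
#include <limits>
#include <string>

// Hypothetical helper: models how C++11 picks the type of an unsuffixed
// decimal literal with the given (non-negative) value. The candidates are
// tried in order: int, long int, long long int.
std::string decimal_literal_type(unsigned long long value) {
    if (value <= static_cast<unsigned long long>(std::numeric_limits<int>::max()))
        return "int";
    if (value <= static_cast<unsigned long long>(std::numeric_limits<long>::max()))
        return "long int";
    if (value <= static_cast<unsigned long long>(std::numeric_limits<long long>::max()))
        return "long long int";
    // No standard signed type fits. An implementation may still offer an
    // extended integer type; otherwise the literal is not representable.
    return "none";
}
```

With a 32-bit int, `decimal_literal_type(2147483647)` yields "int", while `decimal_literal_type(2147483648)` yields "long int" or "long long int", depending on whether long is wider than int on the platform.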

As a side note, this is the reason why constants like INT_MIN are typically defined as

#define INT_MIN (-2147483647 - 1)

instead of the seemingly more straightforward

#define INT_MIN -2147483648

The latter would not work as intended.
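A minimal sketch of why the first form works, assuming a 32-bit int: the literal 2147483647 has type int, so the whole expression stays in int, while 2147483648 is given a wider type when one exists. The names `kIntMin` and `literal_is_wider_than_int` are made up for this illustration.

```cpp
#include <climits>
#include <type_traits>

// Assumption: int is 32 bits wide. 2147483647 fits in int, so negating it
// and subtracting 1 is ordinary int arithmetic with no overflow.
constexpr int kIntMin = -2147483647 - 1;
static_assert(std::is_same<decltype(-2147483647 - 1), int>::value,
              "the idiom yields an expression of type int");
static_assert(kIntMin == INT_MIN, "matches the library's INT_MIN");

// 2147483648 does not fit in a 32-bit int. Under C++11 it gets the first of
// long int / long long int that can hold it, so -2147483648 is not an int.
constexpr bool literal_is_wider_than_int() {
    return sizeof(-2147483648) > sizeof(int);
}
```

On an implementation that follows the C++11 rule and has a 64-bit long or long long, `literal_is_wider_than_int()` returns true, which is why a plain -2147483648 cannot serve as an int-typed INT_MIN.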

AnT stands with Russia
  • This is also why this is done: `#define INT_MIN (-2147483647 - 1)`. – orlp Feb 04 '13 at 20:39
  • Funny, clang for me seems to run this fine (printing false both times), even though my integer size is only 4 bytes. Try again! – Richard J. Ross III Feb 04 '13 at 20:40
  • @RichardJ.RossIII - with clang you are probably getting a 64-bit-typed literal, since it was too big to fit in an `int`. OP's implementation may not have a 64-bit type. – Carl Norum Feb 04 '13 at 20:41
  • @RichardJ.RossIII: I believe this behaviour is implementation-defined/undefined. – Oliver Charlesworth Feb 04 '13 at 20:41
  • The funny thing is that I thought of this, but then decided on the negative sign being included in the literal for some reason. – chris Feb 04 '13 at 20:42
  • I never thought that a "negative number" isn't parsed as such. I don't see a reason. I hope that `-1.0` is parsed as a negative double value, isn't it? – leemes Feb 04 '13 at 20:43
  • @OliCharlesworth: I believe not: `5.4 If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.` This doesn't go for unsigned types, as the standard defines its modulo behaviour earlier. – orlp Feb 04 '13 at 20:47
  • Well, after thinking again, for floats / doubles it doesn't make any difference. They are somewhat "symmetrical", if you know what I mean. The compiler will "optimize" negative numbers anyway. So it's only about the parsing process, not about performance. – leemes Feb 04 '13 at 20:47
  • @nightcracker: But it's not unsigned, it's a `long int` (at least, it would be in C, I can't be bothered to trawl the C++ spec right now... ;) ) – Oliver Charlesworth Feb 04 '13 at 20:54
  • @OliCharlesworth: huh? The sentence I quoted is straight from the C++ spec, and goes for any expression. – orlp Feb 04 '13 at 21:13
  • @nightcracker: I know. But like you said, it's well-defined for unsigned, and undefined for signed types, and the OP's code involves a signed value (`2147483647` is interpreted as a signed (long) (long) int). – Oliver Charlesworth Feb 04 '13 at 21:52
  • @OliCharlesworth: Ah, I read your original comment again. I looked over the undefined behaviour part, and only saw the implementation-defined part. What I meant to say that it is not implementation-defined, but undefined behaviour. – orlp Feb 04 '13 at 21:54
  • 2147483648 is promoted to unsigned int, not long. – qPCR4vir Feb 04 '13 at 22:10
  • @qPCR4vir: No. As I wrote in my comment to your answer, neither modern C nor C++ allow using unsigned types in this case (with an *unsuffixed decimal constant*). Only the first standard C (C89/90) permitted `unsigned long int` in this context, but in C99 this permission was removed. Unsuffixed literals in C and C++ are required to have *signed* types. If you see unsigned type here when a signed one would work, it means your compiler is broken. If you see unsigned type here when no signed type would work, then this is just a specific manifestation of undefined behavior. – AnT stands with Russia Feb 04 '13 at 23:01
  • @AndreyT `#define INT_MIN (-2147483647 - 1)` comes from `limits.h` (`climits`) so I think you should add a reference. You can probably also mention how `limits.h` write a literal constant of `unsigned int` (e.g. `#define UINT_MAX 0xffffffffU`) @nightcracker – Alvin Wong Feb 05 '13 at 06:36
  • I learn yet another thing about C/C++! Thanks! But I wonder: is there a place with a list, both concise and precise, of most of the "caveats" in C/C++? (And in the end I think it makes C/C++ even more and not less reliable than other languages, which may have similar problems but with fewer ways to grok them...) The C-faq and the abridged version of it from Usenet is a good place to start, but any other pointers? – Olivier Dulac Feb 07 '13 at 12:55
  • I always wondered why they didn't `#define INT_MIN (~INT_MAX)` – Neil Feb 13 '13 at 22:31
  • Recently I learned bit-overflow is undefined behaviour ref:`ISO C section 6.5 paragraph 5`, But doing `i = 2147483648` is undefined **?** yes `i = -2147483648` is valid. – Grijesh Chauhan Jul 09 '13 at 05:56
  • @Neil Because two's complement is not guaranteed (until C++20). – L. F. Jul 12 '19 at 05:37
  • @L.F. But the bit pattern for `INT_MIN` is the same in both one's and two's complement, just its value is different. – Neil Jul 12 '19 at 09:45
  • @Neil Well, how about those other than ones' and two's complement? (BTW, it is ones' complement, not one's complement :) – L. F. Jul 12 '19 at 09:48
  • @L.F. Ah right, there's also sign and magnitude. But doesn't `(-2147483647-1)` only work in twos' (my spellchecker flags that) complement anyway? – Neil Jul 12 '19 at 10:02
  • @Neil Well, on other architectures `INT_MIN` has to be defined another way. (And your spellchecker is right — it's *two's* complement, and *ones'* complement.) – L. F. Jul 12 '19 at 10:04
  • @L.F. Sorry, I'm not following; if they're going to define it differently on other architectures anyway, what's wrong with defining it as `(~INT_MAX)` on two's complement? – Neil Jul 12 '19 at 10:11
  • @Neil Oh, `(~INT_MAX)` is OK on two's complement. That's fine. I thought you wanted to automatically adapt to all architectures, but I was thinking wrong. Sorry for that. – L. F. Jul 12 '19 at 10:13
  • @L.F. I may have thought that at the time... I can't really remember, sorry. – Neil Jul 12 '19 at 10:25

The compiler (VC2012) promotes the literal to the "smallest" integer type that can hold its value. In the first case, signed int (and long int) cannot hold 2147483648 (the value before the sign is applied), but unsigned int can, so 2147483648 gets type unsigned int. In the second case, you force a conversion from that unsigned value to int.

const bool i= (-2147483648 > 0) ;  //   --> true

warning C4146: unary minus operator applied to unsigned type, result still unsigned

Here are related "curiosities":

const bool b= (-2147483647      > 0) ; //  false
const bool i= (-2147483648      > 0) ; //  true : result still unsigned
const bool c= ( INT_MIN-1       > 0) ; //  true :'-' int constant overflow
const bool f= ( 2147483647      > 0) ; //  true
const bool g= ( 2147483648      > 0) ; //  true
const bool d= ( INT_MAX+1       > 0) ; //  false:'+' int constant overflow
const bool j= ( int(-2147483648)> 0) ; //  false : 
const bool h= ( int(2147483648) > 0) ; //  false
const bool m= (-2147483648L     > 0) ; //  true 
const bool o= (-2147483648LL    > 0) ; //  false
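The "result still unsigned" rows follow from modular arithmetic on unsigned types. A small sketch, using the fixed-width `std::uint32_t` to stand in for a 32-bit unsigned int; the helper name `negate_u32` is hypothetical.

```cpp
#include <cstdint>

// Negation of an unsigned value is computed modulo 2^32, so the result is
// never negative. The cast keeps the result in 32 bits even if int is wider.
constexpr std::uint32_t negate_u32(std::uint32_t x) {
    return static_cast<std::uint32_t>(0u - x);
}

// 2^31 is its own negation modulo 2^32, hence "-2147483648 > 0" when the
// literal is treated as unsigned.
static_assert(negate_u32(2147483648u) == 2147483648u, "2^31 negates to itself");
static_assert(negate_u32(1u) == 4294967295u, "-1 wraps to 2^32 - 1");
```

This only models what happens once the literal has an unsigned type; whether a compiler is allowed to pick an unsigned type here is a separate question, discussed in the comments.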

C++11 standard:

2.14.2 Integer literals [lex.icon]

An integer literal is a sequence of digits that has no period or exponent part. An integer literal may have a prefix that specifies its base and a suffix that specifies its type.

The type of an integer literal is the first of the corresponding list in which its value can be represented.

[Image: table listing, for each suffix and base, the types an integer literal may have]

If an integer literal cannot be represented by any type in its list and an extended integer type (3.9.1) can represent its value, it may have that extended integer type. If all of the types in the list for the literal are signed, the extended integer type shall be signed. If all of the types in the list for the literal are unsigned, the extended integer type shall be unsigned. If the list contains both signed and unsigned types, the extended integer type may be signed or unsigned. A program is ill-formed if one of its translation units contains an integer literal that cannot be represented by any of the allowed types.

And these are the promotion rules for integers in the standard.

4.5 Integral promotions [conv.prom]

A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
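For illustration, the promotion rule quoted above can be observed with unary +, which applies the integral promotions to its operand. This sketch assumes a typical platform where int can represent every value of the narrower types; `promotes_to_int` is a hypothetical helper.

```cpp
#include <type_traits>
#include <utility>

// True when the integral promotions turn T into int. Unary + performs the
// promotions, so decltype(+t) is the promoted type.
template <typename T>
constexpr bool promotes_to_int() {
    return std::is_same<decltype(+std::declval<T>()), int>::value;
}

static_assert(promotes_to_int<signed char>(), "signed char promotes to int");
static_assert(promotes_to_int<short>(), "short promotes to int");
// unsigned int has the same rank as int, so it is not promoted.
static_assert(!promotes_to_int<unsigned int>(), "unsigned int stays unsigned");
```

Note that these promotions apply to operands of types narrower than int; the type of a literal such as 2147483648 is decided earlier, by the [lex.icon] rule quoted above.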

uınbɐɥs
qPCR4vir
  • @qPCR4vir: In C89/90 the compilers were supposed to use types `int`, `long int`, `unsigned long int` to represent unsuffixed decimal constants. That was the only language that allowed using unsigned types for unsuffixed decimal constants. In C++98 it was `int` or `long int`. No unsigned types allowed. Neither C (starting from C99) nor C++ permits the compiler to use unsigned types in this context. Your compiler is, of course, free to use unsigned types if none of the signed ones work, but this is still just a specific manifestation of undefined behavior. – AnT stands with Russia Feb 04 '13 at 23:03
  • @AndreyT. Great! Of course, you're right. Is VC2012 broken? – qPCR4vir Feb 05 '13 at 01:21
  • @qPCR4vir: AFAIK, VC2012 is not a C++11 compiler yet (is it?), which means that it has to use either `int` or `long int` to represent `2147483648`. Also, AFAIK, in VC2012 both `int` and `long int` are 32-bit types. This means that in VC2012 literal `2147483648` should lead to *undefined behavior*. When the behavior is undefined, the compiler is allowed to do anything. That would mean that VC2012 is not broken. It simply issued a misleading diagnostic message. Instead of telling you that behavior is flat out undefined it decided to use an unsigned type. – AnT stands with Russia Feb 05 '13 at 02:12
  • @AndreyT: Are you saying that compilers are free to emit nasal demons if source code contains an unsuffixed decimal literal which exceeds the maximum value of a signed `long`, and are not required to issue a diagnostic? That would seem broken. – supercat Feb 05 '13 at 03:59
  • Same "warning C4146" in VS2008 and "this decimal constant is unsigned only in ISO C90" in G++ – spyder Feb 05 '13 at 09:08
  • @supercat: Yes, nasal demons are allowed, no diagnostic required. As an approximate rule, diagnostics are not required when it was clear that implementations could offer a sensible extension. And by 1998, supporting literals bigger than `LONG_MAX` was such an obvious extension. – MSalters Feb 06 '13 at 00:06
  • @MSalters: It may be reasonable not to require a diagnostic for a number bigger than a `long` in cases where there's a longer type that can handle it, but allowing nasal demons without a diagnostic on compilers that don't have such a type seems broken. The fact that a compiler is not required to handle a particular program should not mean that any compiler that accepts the program should be free to emit nasal demons. I would think it would be sensible to say that if a compiler accepts a certain program without complaint, it must have certain behavior. – supercat Feb 06 '13 at 00:14
  • @supercat: It's easy to say that informally, but try to put that in Standardese. In this specific case, C++11 does have the necessary Standardese, "extended integer type (3.9.1)". But addressing it in a generic manner is very, very hard. – MSalters Feb 06 '13 at 09:44

In short, 2147483648 overflows to -2147483648, and (-(-2147483648) > 0) is true.

This is what 2147483648 looks like in binary: 10000000 00000000 00000000 00000000

In addition, in the case of signed binary calculations, the most significant bit ("MSB") is the sign bit. This question may help explain why.

Community
drzymala

Because -2147483648 is actually 2147483648 with negation (-) applied to it, the number isn't what you'd expect. It is actually the equivalent of this pseudocode: operator -(2147483648)

Now, assuming your compiler has sizeof(int) equal to 4 and CHAR_BIT is defined as 8, that would make 2147483648 overflow the maximum signed value of an integer (2147483647). So what is the maximum plus one? Let's work that out with a 4-bit, two's complement integer.

Wait! 8 overflows the integer! What do we do? Use its unsigned representation of 1000 and interpret the bits as a signed integer. This representation leaves us with -8, to which the two's complement negation is applied, resulting in 8, which, as we all know, is greater than 0.

This is why <limits.h> (and <climits>) commonly define INT_MIN as ((-2147483647) - 1) - so that the maximum signed integer (0x7FFFFFFF) is negated (0x80000001), then decremented (0x80000000).
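The hexadecimal values above can be checked with a short sketch. It uses `std::int32_t`, which is required to be two's complement where it exists; `bits_of` is a hypothetical helper that exposes the bit pattern by converting to `std::uint32_t`.

```cpp
#include <cstdint>

// Converting a 32-bit two's complement value to uint32_t is defined as
// reduction modulo 2^32, which yields the same bit pattern.
constexpr std::uint32_t bits_of(std::int32_t v) {
    return static_cast<std::uint32_t>(v);
}

static_assert(bits_of(2147483647) == 0x7FFFFFFFu, "maximum value");
static_assert(bits_of(-2147483647) == 0x80000001u, "negated maximum");
static_assert(bits_of(-2147483647 - 1) == 0x80000000u, "minimum value");
```

Every intermediate value in `-2147483647 - 1` fits in a 32-bit int, so none of these checks relies on overflow.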

Cole Tobin
  • For a 4 bit number, the two's complement negation of `-8` is still `-8`. – Ben Voigt Aug 15 '18 at 20:16
  • Except that -8 is interpreted as 0-8, not negative 8. And 8 overflows a 4 bit signed int – Cole Tobin Aug 17 '18 at 16:52
  • Consider `-(8)` which in C++ is the same as `-8` -- it is negation applied to a literal, not a negative literal. The literal is `8`, which doesn't fit in a signed 4-bit integer, so it must be unsigned. The pattern is `1000`. So far your answer is correct. The two's complement negation of `1000` in 4 bits is `1000`, it doesn't matter if it is signed or unsigned. Your answer, says "interpret the bits as a signed integer" which makes the value `-8` after the two's complement negation, just as it was before negation. – Ben Voigt Aug 17 '18 at 19:55
  • Of course, in "4-bit C++" there is no "interpret the bits as a signed integer step". The literal becomes the smallest type that can express it, which is *unsigned 4-bit integer*. The value of the literal is `8`. Negation is applied (modulo 16), resulting in a final answer of `8`. The encoding is still 1000 but the value is different because an unsigned type was chosen. – Ben Voigt Aug 17 '18 at 19:58