Why is there a difference between -2147483648 and (int)-2147483648?

Question

When I run the following code under Windows 7 x64, compiled with MinGW's GCC, the result seems to have underflowed:

cout<<-2147483648 ;    //Output: 2147483648

but when I assign it to an integer variable, or simply convert it to the int type:

cout<<(int)-2147483648 ; //Output: -2147483648

So, what's wrong with the first version of my code? Isn't it of type int? Or what exactly is the lower bound of an integer? Many thanks.

Looks like `clang` and `g++` don't agree on this: http://nopaste.info/fea6d05c3c.html — akappa, Sep 27 '12 at 12:05
[Why is -(-2147483648) = - 2147483648 in a 32-bit machine?](https://stackoverflow.com/q/42462352/995714), [Casting minimum 32-bit integer (-2147483648) to float gives positive number (2147483648.0)](https://stackoverflow.com/q/11536389/995714), [Why is 0 < -0x80000000?](https://stackoverflow.com/q/34182672/995714) — phuclv, Aug 15 '21 at 08:48

score 11 · Accepted Answer · 2012-09-27T12:10:52.467

2147483648 doesn't fit into an int or a long on your system, so it's treated as a constant of type unsigned long. (Edit: as ouah pointed out in the comments, it's undefined behaviour in standard C++, but your compiler accepts it as an extension.) Negating an unsigned integer value is possible, but results in another unsigned integer value, never a negative number. Negating 2147483648UL produces 2147483648UL (assuming, as is the case on your system, that unsigned long is a 32 bit type).

Casting that to int produces an implementation-defined result, commonly the result you see, but not necessarily. You can get the result you want without any conversions by writing -2147483647 - 1.
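
A minimal sketch of the point above, assuming a C++11 or later compiler and a 32-bit `int` (the helper name `int_min_portable` is made up for illustration). It checks at compile time that `-2147483647 - 1` has type `int` and equals the minimum `int`, while the negated bare literal does not have type `int`:

```cpp
#include <limits>
#include <type_traits>

// Assumption: int has 31 value bits plus a sign bit, as on the asker's system.
static_assert(std::numeric_limits<int>::digits == 31,
              "this sketch assumes a 32-bit int");

// Both 2147483647 and 1 fit in int, so the whole expression has type int.
// int_min_portable is a hypothetical helper name.
constexpr int int_min_portable() { return -2147483647 - 1; }

static_assert(std::is_same<decltype(-2147483647 - 1), int>::value,
              "-2147483647 - 1 has type int");
static_assert(int_min_portable() == std::numeric_limits<int>::min(),
              "and it equals the smallest int");

// 2147483648 does not fit in a 32-bit int, so the literal gets a wider type
// before the unary minus is applied; the negated expression is not an int.
static_assert(!std::is_same<decltype(-2147483648), int>::value,
              "-2147483648 does not have type int");
```

Every check is a `static_assert`, so a successful compile is the result. In a program, `std::cout << int_min_portable();` should print the same value as `std::numeric_limits<int>::min()`.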

answered Sep 27 '12 at 11:57

`2147483648` is a constant of type `long long` not `unsigned long`. An unsuffixed decimal constant is of type `int`, `long` or `long long`. – ouah Sep 27 '12 at 12:02
C++ tag ok. This is the case for C99 and C11 in C. – ouah Sep 27 '12 at 12:04
@ouah Good point, this question (based on the use of `cout`) clearly assumes C++, so I'll remove the C tag. – Sep 27 '12 at 12:06
So why don't `g++` and `clang++` agree on this? http://nopaste.info/fea6d05c3c.html – akappa Sep 27 '12 at 12:06
For C++, if we take C++98 it is undefined behavior in 32-bit systems: *(2.13.1p2) "If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int; if the value cannot be represented as a long int, the behavior is undefined."* – ouah Sep 27 '12 at 12:08
@hvd: `clang` gives signed integers when invoked with both `--std=c++03` (which must be the default) and `--std=c++11`. – akappa Sep 27 '12 at 12:10
@akappa Yes, ouah rightly points out that the behaviour is undefined, so either behaviour is acceptable. – Sep 27 '12 at 12:11
Yeah, I agree on that. So `clang++` is representing it in a bigger signed type, while `g++` is representing it in an `unsigned` one (and it is complaining that its result is "guaranteed" only in C90, in a rather cryptic warning message) – akappa Sep 27 '12 at 12:12
@akappa: that's because in C99 it's `signed long long`, whereas in C90 it's `unsigned long`. So it's not cryptic if you already know that there's shenanigans going on with large-ish literals. – Steve Jessop Sep 27 '12 at 12:15
@SteveJessop: but this is `C++`! Why is it talking about `C`, which is an entirely different language? – akappa Sep 27 '12 at 12:17
@akappa: ah yes, fair point. OK, so it's because in C++ it's undefined behavior whereas in C90 it's `unsigned long`. So it's not cryptic if you already know that there's shenanigans going on with large-ish literals, and that `g++`'s version of those shenanigans is based on C90 ;-) If g++'s behavior was for some reason based on Python it might say that it was only guaranteed in Python, but you're more likely to share a header file with a literal in it between C and C++ than between Python and C++, so it's not a total non-sequitur to mention C. – Steve Jessop Sep 27 '12 at 12:21
For the record, C and C++ behave exactly the same in the latest C11/C++11 standards. C11 states that if the constant has no suffix and is in decimal format, it can only be int, long or long long. Hexadecimal literals without a suffix can however be unsigned. See C11 6.4.4.1. – Lundin Sep 27 '12 at 12:49
@Lundin: then `g++` is still non-compliant when invoked with `--std=c++0x`. – akappa Sep 27 '12 at 13:00
@akappa: yes, and there's no obvious entry for implementing that part of C++11 in the list here: http://gcc.gnu.org/projects/cxx0x.html – Steve Jessop Sep 27 '12 at 13:08
I would have expected it to be covered by the `long long` entry, but that's marked as being available as of GCC 4.3. – Sep 27 '12 at 13:09
@hvd: that's what I expected too until I ran the code. `12147483648` has type `long long` as required. So literals between `LONG_MAX+1` and `ULONG_MAX` remain a disaster-zone as far as gcc is concerned... – Steve Jessop Sep 27 '12 at 13:15
When I get the chance, I'll test in GCC 4.7, and if it fails there, and no one has done so earlier, I'll report it as a bug. – Sep 27 '12 at 13:20
@akappa @SteveJessop With MinGW's GCC 4.7.2, `int main() { if (-2147483648 >= 0) __builtin_abort (); return 0; }` aborts with `-std=c++98` or `-std=c++03`, and runs without any problems with `-std=c++11`. So it has been fixed already. – Sep 27 '12 at 19:15

Mike Seymour · Answer 2 · 2012-09-27T13:17:15.050

So, what's wrong with the previous version of my code?

Presumably, you're using a pre-2011 compiler, and on your system long has 32 bits. The value (-2³¹) isn't guaranteed to fit into long, so it might overflow. That gives undefined behaviour, so you could see anything.

The most likely explanation for the particular value you see (2³¹) is that, in the absence of defined behaviour in C++, your compiler is using the old C90 rules, and converting the value to unsigned long.
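
To illustrate why an unsigned interpretation would yield 2³¹, here is a small sketch. It assumes a C++11 compiler and that `std::uint32_t` is available; `negate_u32` is a hypothetical helper, not something from the question:

```cpp
#include <cstdint>

// Negating an unsigned value wraps modulo 2^N; the result is never negative.
// negate_u32 is a hypothetical helper name used only for this illustration.
constexpr std::uint32_t negate_u32(std::uint32_t x) {
    // The conversion back to uint32_t reduces the result modulo 2^32.
    return static_cast<std::uint32_t>(-x);
}

// 2^32 - 2^31 == 2^31, so "negating" 2147483648 gives 2147483648 again.
static_assert(negate_u32(2147483648u) == 2147483648u,
              "negating 2^31 as a 32-bit unsigned value yields 2^31");

// For comparison: negating 1 yields 2^32 - 1.
static_assert(negate_u32(1u) == 4294967295u,
              "negating 1 as a 32-bit unsigned value yields 2^32 - 1");
```

This matches the output in the question: with a 32-bit `unsigned long`, printing the negated value shows 2147483648 rather than a negative number.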

Isn't it the type int?

Before 2011, it was int if the value is representable by int, otherwise long, with undefined behaviour if that isn't sufficient. C++11 adds the long long type, and allows that to be used for integer literals if long isn't big enough.

Or what exactly is the lower bound of an integer?

Signed integer types with N bits have a range of at least -2^(N-1)+1 to 2^(N-1)-1. Your value is -2³¹, which is just out of range for a 32-bit signed type.

The language doesn't specify the exact size of the integer types; just that int must have at least 16 bits, long at least 32, and (since 2011) long long at least 64.
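
A sketch of those guarantees as compile-time checks, assuming C++11 (where the `std::numeric_limits` members are `constexpr`); `fits_in_int` is an illustrative helper name, not a standard function:

```cpp
#include <limits>

// numeric_limits<T>::digits counts value bits, excluding the sign bit.
static_assert(std::numeric_limits<int>::digits >= 15,
              "int has at least 16 bits");
static_assert(std::numeric_limits<long>::digits >= 31,
              "long has at least 32 bits");
static_assert(std::numeric_limits<long long>::digits >= 63,
              "long long has at least 64 bits (since C++11)");

// Hypothetical helper: does a value fit in int on this implementation?
// The comparison is done in long long, so it cannot overflow.
constexpr bool fits_in_int(long long v) {
    return v >= std::numeric_limits<int>::min() &&
           v <= std::numeric_limits<int>::max();
}

// 32767 is within the minimum range every conforming int must cover.
static_assert(fits_in_int(32767), "2^15 - 1 always fits in int");
static_assert(fits_in_int(-32767), "-(2^15 - 1) always fits in int");
```

On a typical system with a 32-bit two's complement `int`, `fits_in_int(2147483648LL)` is false while `fits_in_int(-2147483647LL - 1)` is true. The standard's minimum range alone does not promise the latter.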

The comments, deletions, and edits get confusing. Hopefully this one's right: the value 2^31 isn't guaranteed to fit into `long`, but if it does, so does -2^31. If the value 2^31 doesn't fit, the behaviour is undefined, so the conclusion is the same. — , Sep 27 '12 at 13:07
If `long` is guaranteed to have 32 bits by the standard, 2^31 will fit into it. The only reason it couldn’t fit is that the standard is not observed for compilation (either because the compiler is incapable or just not doing it). — Keno, Feb 26 '17 at 04:03

score 2 · Answer 3 · edited May 23 '17 at 11:53

First of all, it is important to understand that there are no negative integer literals.

Others have explained why the OP's particular compiler behaves as it does. But for the record, this is what the compiler should do, between the lines, on a 32-bit system:

You have the number 2147483648, which cannot fit in a 32-bit signed int of two's complement format.
Since it is a decimal number (without a U, L or similar suffix), the compiler checks its internal type table (1) for such an integer constant. It works like this: try to fit it in an int; if it doesn't fit, try a long; if it doesn't fit there either, try a long long; if it doesn't fit there either, we have undefined behavior. A C or C++ compiler following the latest standard will not attempt to fit it in unsigned types.
In this specific case, the number doesn't fit in an int nor in a long, so the compiler decides to use a long long as type for the literal.
You then use the unary minus operator on this literal, ending up with the number -2147483648. Ironically this would fit in a signed int of two's complement format, but it is too late to change the type, the compiler has already picked long long as the type.

(1) This "internal table" looks different if you have an unsigned suffix, or if you have hex format etc. If there is an unsigned suffix, it will only check if the number fits in unsigned numbers. If there is hex notation (but no suffix), it will check int, then unsigned int, then long and so on.
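
The type selection described above can be checked with a short sketch. It assumes a C++11 or later compiler and a 32-bit `int`; the helper names `negated_decimal_is_negative` and `negated_hex` are invented for this example:

```cpp
#include <limits>
#include <type_traits>

// Assumption: int has 31 value bits plus a sign bit.
static_assert(std::numeric_limits<int>::digits == 31,
              "this sketch assumes a 32-bit int");

// Decimal, no suffix: only signed candidates (int, long, long long).
static_assert(std::is_signed<decltype(2147483648)>::value,
              "an unsuffixed decimal literal gets a signed type");
static_assert(sizeof(decltype(2147483648)) > sizeof(int),
              "2147483648 needs a type wider than int");

// Hexadecimal, no suffix: unsigned int is tried right after int.
static_assert(std::is_same<decltype(0x80000000), unsigned int>::value,
              "0x80000000 has type unsigned int");

// Unsigned suffix: only unsigned candidates are considered.
static_assert(std::is_same<decltype(2147483648u), unsigned int>::value,
              "2147483648u has type unsigned int");

// Hypothetical helpers showing the effect of unary minus on each form.
constexpr bool negated_decimal_is_negative() { return -2147483648 < 0; }
constexpr unsigned int negated_hex() { return -0x80000000; }

static_assert(negated_decimal_is_negative(),
              "the negated decimal literal is a negative signed value");
static_assert(negated_hex() == 2147483648u,
              "the negated hex literal wraps around and stays unsigned");
```

If the stated assumptions hold, the file compiles silently. A failed `static_assert` would point at an implementation that picks types differently, for example a compiler running in a pre-C++11 mode.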

answered Sep 27 '12 at 13:10

Lundin

"A C or C++ compiler following the latest standard will not attempt to fit it in unsigned types." -- If the behaviour is undefined, it may attempt to fit in any other type, unsigned or not, that it likes. "In this specific case, the number doesn't fit in an int so the compiler decides to use a long as type for the literal." -- The literal fits neither into an `int` nor into `long`. – Sep 27 '12 at 13:13
@hvd: In C++11 at least, it can't choose an unsigned type: "If all of the types in the list for the literal are signed, the extended integer type shall be signed." – Mike Seymour Sep 27 '12 at 13:15
@MikeSeymour `unsigned long` isn't picked as an extended integer type. GCC doesn't implement any extended integer types. The last sentence applies: "A program is ill-formed if one of its translation units contains an integer literal that cannot be represented by any of the allowed types." and an ill-formed program may be accepted, with no standard-mandated behaviour, so long as a diagnostic is issued. (Obviously, the specific constant in this question will always fit into `long long`, but to apply the C++11 rules to the question, we need to pretend the constant is larger.) – Sep 27 '12 at 13:16
@hvd Yeah my bad, thanks for pointing that out. Post updated, I've changed long with long long where applicable. – Lundin Sep 27 '12 at 14:13
@hvd Regarding undefined behavior: it will only attempt other types once it is done iterating through its internal table of allowed types. So if it doesn't fit in int, long or long long, then and only then the UB kicks in and it might very well start to check for unsigned types (or crash & burn) at that point. But note the difference with for example hex literals: they behave as the old C90 way: check int, then unsigned int, then long, then unsigned long and so on. A plain signed literal will not check for unsigned numbers in between the signed ones. – Lundin Sep 27 '12 at 14:16
@Lundin Indeed, when the compiler is invoked in C++98/C++03 mode, there is no signed integer type larger than long, and unsigned long can be tried after long. When the compiler is invoked in C++11 mode, there is also long long which must be checked first. – Sep 27 '12 at 17:51

score 0 · Answer 4 · answered Oct 02 '12 at 04:42

Actually, I found an explanation in a PDF from CS:APP that answers this perfectly. You can download it here: http://www.csapp.cs.cmu.edu/public/waside/waside-tmin.pdf

lichenbo
