
The following C/C++ code:

    long long foo = -9223372036854775808LL; // -2^63

compiles (g++) with the warning

integer constant is so large that it is unsigned.

clang++ gives a similar warning.

Thanks to this bug report (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52661), I now understand why GCC gives this warning. Unfortunately, the response to the bug report didn't explain the reason for this behaviour very well.

Questions:

  • Why is no warning given for the equivalent code for a 32/16/8-bit signed integer constant?
  • GCC and Clang both give this warning, so it is clearly intentional behaviour and not just 'to make it easier to parse,' as is suggested in response to the bug report. Why?
  • Is this behaviour mandated by the C/C++ standard? Some other standard?
jezza
  • @AndrewHenle It's nevertheless a strange warning for decimal constants, since `0x8000000000000000LL` doesn't generate the warning, but it too is so large that it is unsigned. But in this case the standard has it covered by well-defined behavior, even though it isn't obvious. – Lundin Nov 25 '20 at 16:05
  • @phuclv Yes, this is definitely a duplicate, but let's chill for a while and see if this post should be closed or the one you linked should be closed as a dupe of this one. In particular, the linked post doesn't give any reference to the actual standard(s), only cppreference.com. Also, I think this question here was clearer than the one in the dupe. – Lundin Nov 25 '20 at 16:14
  • @phuclv Thank you, I had seen that post but obviously didn't read it carefully enough. – jezza Nov 25 '20 at 16:17
  • @AndrewHenle It's not obvious how the type of integer constants is defined. I think it's perfectly reasonable to assume that 2,147,483,648 is promoted to an unsigned int and then cause -2,147,483,648 to give the same compiler warning (though I agree using a short was a bad example). I don't understand how this has anything to do with parsing tokens, and that's what the response to the bug report focussed on. So yes, I do think it was blinkered. – jezza Nov 25 '20 at 16:29
  • @AndrewHenle Additionally, misunderstanding something and pressing for an explanation is not blinkered. Refusing to explain something properly (or admitting you don't know the answer) when asked, is blinkered. – jezza Nov 25 '20 at 17:10
  • @jezza That's your opinion. Look at the timestamps on those bug comments. Mine is that obstinately and even obnoxiously refusing to take the time to think much less even acknowledge **multiple** statements along the lines of "`-9223372036854775808LL` is actually two tokens" and "`9223372036854775808LL` doesn't fit into a `signed long long` hence the warning" is literally "blinkered" as in "a horse with blinkers who can't see". Given that some of those comments are mere minutes apart, you could even say the horse in this case was voluntarily blindfolded and **refused** to see. (Horse? close) – Andrew Henle Nov 25 '20 at 17:35
  • (cont) If you want to venture into the corner cases of the C language, you need to be precise and careful and be willing to spend a lot of time making **sure** what you are doing is proper. C is *extremely* unforgiving, and when you venture into corner cases like this, [here be dragons](https://en.wikipedia.org/wiki/Here_be_dragons). And if you don't take the time to understand, [those dragons **will** bite you](https://users.csc.calpoly.edu/~jdalbey/SWE/Papers/att_collapse). (and that's not even a real corner case...) – Andrew Henle Nov 25 '20 at 17:41

3 Answers


This has to do with how the type of integer constants is defined.

First, as mentioned in the gcc bug report, -9223372036854775808LL is actually two tokens: the unary - operator and the integer constant 9223372036854775808LL. So the warning applies only to the latter.

Section 6.4.4.1p5 of the C standard states:

The type of an integer constant is the first of the corresponding list in which its value can be represented.

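[Table from the C standard, section 6.4.4.1p5: candidate types for an integer constant, by suffix and by decimal versus octal/hexadecimal form]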

Based on this, a decimal integer constant with no suffix will have type int, long, or long long based on the value. These are all signed types. So any value small enough to fit in an 8-bit or 16-bit type still has type int, and a value too large for a 32-bit signed int will have type long or long long, depending on the sizes of those types on that system. The same goes for a constant with the LL suffix, but only the long long type is tried.

The warning comes up because the value you're using doesn't fit in any type in the above list. Any smaller value gets a signed type, so there is no conversion to unsigned.
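The typing rule described above can be checked at compile time. The following is a minimal sketch, assuming a C11 compiler (for `_Generic` and `static_assert`); the `IS_TYPE` macro is a hypothetical helper introduced only for this illustration and is not part of the original answer.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* INT_MAX */

/* Hypothetical helper: evaluates to 1 if expression x has exactly type T, else 0. */
#define IS_TYPE(x, T) _Generic((x), T: 1, default: 0)

/* A small unsuffixed decimal constant is int, never char or short. */
static_assert(IS_TYPE(100, int), "100 should have type int");

/* With the LL suffix, long long is the only candidate type. */
static_assert(IS_TYPE(100LL, long long), "100LL should have type long long");

/* The largest value that still fits keeps the signed type. */
static_assert(IS_TYPE(9223372036854775807LL, long long),
              "2^63 - 1 should have type long long");

/* A value too large for a 32-bit int moves to the next signed type that fits.
   The INT_MAX test skips the check on platforms where int is wider. */
static_assert(INT_MAX > 2147483647
                  || IS_TYPE(2147483648, long)
                  || IS_TYPE(2147483648, long long),
              "2147483648 should have a signed type wider than int");
```

If the file compiles, every constant above received the type the list predicts. Adding a check for 9223372036854775808LL would instead trigger the warning from the question, because no type in the LL list can hold that value.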

dbush

As various more or less confused people in the bug report said, the integer constant 9223372036854775808LL is too large to fit inside a long long.

For decimal constants, the standard has a list in 6.4.4.1 (see the answer by @dbush) describing what types the compiler will try to give to an integer constant. In this case, the only candidate type is (signed) long long, and the value won't fit there. Then §6 under that table kicks in:

If an integer constant cannot be represented by any type in its list, it may have an extended integer type, if the extended integer type can represent its value. /--/
If the list contains both signed and unsigned types, the extended integer type may be signed or unsigned.

Extended integer type is a fuzzy but formal term in the standard. In this case the compiler apparently tries to squeeze the constant into an unsigned long long "extended integer type", where it fits. This behavior isn't really guaranteed; it is implementation-defined.

Then the unary - operator is applied to the unsigned long long which produces the warning.

This is the reason why library headers such as limits.h like to define LLONG_MIN as

    #define LLONG_MIN (-9223372036854775807LL - 1)

You could do something similar to avoid this warning. Or better yet, use LLONG_MIN.
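As a rough illustration of that workaround, the sketch below builds the same value from two constants that each fit in long long. It assumes a C11 compiler and a 64-bit two's-complement long long; the function name `min_long_long` is hypothetical and used only for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* LLONG_MIN, LLONG_MAX */

/* Hypothetical helper: returns -2^63 without ever writing the constant
   9223372036854775808LL, which does not fit in long long. */
long long min_long_long(void)
{
    /* 9223372036854775807LL fits in long long, so the constant, its
       negation, and the subtraction of 1 all stay within signed range. */
    return -9223372036854775807LL - 1;
}

/* The whole expression has type long long; no unsigned type is involved. */
static_assert(_Generic((-9223372036854775807LL - 1), long long: 1, default: 0),
              "expression should have type long long");
```

Under the stated assumption about the width of long long, `min_long_long()` compares equal to `LLONG_MIN`, and the code compiles without the "so large that it is unsigned" warning.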

Lundin
  • `unsigned long long` is not an extended integer type. The compiler is implementing a *conforming extension*, where the code is ill-formed but it accepts it anyway by using `unsigned long long`. Because the code is ill-formed, it's required to issue a diagnostic (a warning counts as a diagnostic). – Brian Bi Nov 25 '20 at 16:43
  • @Brian Talk to the person who cross-tagged this as C and C++. This answer is about C. There is nothing called conforming extension or ill-formed in C. – Lundin Nov 26 '20 at 07:14

Why is no warning given for the equivalent code for a 32/16/8-bit signed integer constant?

A constant is not limited to 8, 16, or 32 bits. Its type is the first type in the list that fits, and decimal constants can go up to at least 63 bits.

9223372036854775808LL is outside OP's long long range, as 9223372036854775808 takes 64 bits.

The - is applied after the constant is made.

On an implementation where int, long, and long long are 32, 32, and 64 bits: -2147483648 has type long long, not int.
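The point that the - is applied only after the constant has been typed can be sketched as follows. This assumes a C11 compiler; the `IS_INT` macro is a hypothetical helper for this illustration, and the checks are only compiled where int is 32 bits wide.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* INT_MAX, INT_MIN */

/* Hypothetical helper: evaluates to 1 if expression x has type int, else 0. */
#define IS_INT(x) _Generic((x), int: 1, default: 0)

#if INT_MAX == 2147483647
/* 2147483648 does not fit in a 32-bit int, so it is typed as a wider signed
   type first; negating it afterwards does not bring it back to int. */
static_assert(!IS_INT(-2147483648), "-2147483648 should not have type int");

/* Each operand here fits in int, so the result has type int. */
static_assert(IS_INT(-2147483647 - 1), "-2147483647 - 1 should have type int");

/* The library macro is required to have type int as well. */
static_assert(IS_INT(INT_MIN), "INT_MIN should have type int");
#endif
```

No warning appears in the 32-bit case because a wider signed type is still available in the list. For 9223372036854775808LL there is none, which is where the warning comes from.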


GCC and Clang both give this warning, so it is clearly intentional behavior and not just 'to make it easier to parse,' as is suggested in response to the bug report. Why?

No comment. The link was not informative. Best to post the relevant data here.


Is this behavior mandated by the C/C++ standard? Some other standard?

Yes, by the C standard.

chux - Reinstate Monica