
I'm puzzled by a difference in behavior between MSVC and clang for this code:

#include <iostream>
#include <cstdint>

int main() {
  int64_t wat = -2147483648;
  std::cout << "0x" << std::hex << wat << std::endl;

  return 0;
}

Visual Studio 2010, 2012, and 2013 (on Windows 7) all display:

0x80000000

But clang 503.0.40 (on OSX) displays:

0xffffffff80000000

What is the correct behavior according to the C++ standard? Should the literal be zero-extended or sign-extended to 64 bits?

I'm aware that int64_t wat = -2147483648LL; will lead to that same result on both compilers, but I wonder about the proper behavior without the literal suffix.

Josh Peterson
  • I have a feeling this changed in C++11 to give the more sensible behaviour you're seeing above with clang - I'm sure a language lawyer will be along soon to quote chapter and verse... – Paul R Sep 02 '14 at 14:39
  • @Paul R: The "non-sensible" behavior existed only in C89/90, where the compiler had to try `int`, `long int` and then `unsigned long int`. The inclusion of an unsigned type was a bad decision, which was abandoned in C99 and in the very first version of C++ (C++98). Now the compiler has to try only signed types, in order of increasing width. In this case the only difference in C++11 is the addition of `long long int`. It is purely quantitative. – AnT stands with Russia Sep 02 '14 at 14:51
  • OK - thanks - maybe I was getting mixed up between C and C++. FWIW I always add the explicit suffix anyway, so never have to think about what the behaviour should be. – Paul R Sep 02 '14 at 14:53

1 Answer


The type on the left of the initialization does not matter. The expression -2147483648 is interpreted by itself, independently.

The literal 2147483648 has no suffix, which means that the compiler will first attempt to interpret it as an int value. On your platform int is apparently a 32-bit type, and the value 2147483648 falls outside the range of a signed 32-bit integer type. The compiler is then required to use a wider signed integer type to represent the value, if one is available (in the int, long int, long long int sequence, with the last one being formally available from C++11 on). But if no sufficiently wide signed integer type is available, the behavior is undefined.

In MSVC, long int has historically had the same width as int, although later versions of MSVC also support a 64-bit long long int. However, even in VS2013 (which supports long long int) I get

warning C4146: unary minus operator applied to unsigned type, result still unsigned

in response to your initialization. This means that MSVC still sticks to the archaic C89/90 rules of integer literal interpretation (where the type was chosen from the int, long int, unsigned long int sequence).

Note that MSVC is not officially a C++11 compiler, so formally it does not have to try long long int. Such a type does not formally exist in the pre-C++11 language. From that point of view, MSVC has no sufficiently large signed integer type to use in this case and the behavior is undefined. Within the freedom provided by undefined behavior, the use of C89/90 rules for integer literal interpretation is perfectly justifiable.

You might also take a look at this question: (-2147483648> 0) returns true in C++?

AnT stands with Russia
  • I think it depends on whether you're compiling as C++11 or not? – Paul R Sep 02 '14 at 14:43
  • @Paul R: How? Since C++98 "suffixless" integer literals had to be tried for `int` and `long int`. In MSVC both normally have the same width. – AnT stands with Russia Sep 02 '14 at 14:47
  • The command line I'm using for MSVC is `cl wat.cpp /EHsc`, so no special flags other than the one for exception handling. I don't think there is a C++11 flag in MSVC. – Josh Peterson Sep 02 '14 at 14:50
  • I'm not an expert on this, but I think there was an SO question on this very subject within the last few days - I'll see if I can find it... – Paul R Sep 02 '14 at 14:50
  • OK - here it is: http://stackoverflow.com/questions/25605777/c-literal-postfix-u-ul-problems/25606226#25606226 - I was evidently thinking about C, not C++ - sorry for the noise... – Paul R Sep 02 '14 at 14:57
  • @AndreyT Thanks for the information about the warning. When I use `/Wall` on the command line I see it. Here is the MSDN page for that warning: http://msdn.microsoft.com/en-us/library/4kh09110.aspx – Josh Peterson Sep 02 '14 at 15:03
  • @Josh Peterson: I retract my assertion that this is a bug in MSVC. It would be a bug in C++11 compiler, but MSVC is not officially a C++11 compiler yet. From the pre-C++11 point of view the MSVC's behavior is formally legal. – AnT stands with Russia Sep 02 '14 at 15:09
  • This still doesn't answer the question about how a 64-bit literal should be represented. Unless that wasn't the OP's question, in which case, the question title should be changed to fit. – Cole Tobin Sep 02 '14 at 22:06
  • @ColeJohnson I think this does answer the question. In C++11, a 64-bit integer literal should be represented using a `long long int`. Thanks to warning C4146, we know that MSVC is instead using `unsigned long int`, which is a valid choice for a pre-C++11 compiler. – Josh Peterson Sep 03 '14 at 10:26