C++ integer constant's type

Question

According to MSDN (Integer Types - VC2008):

The type for a decimal constant without a suffix is either int, long int, or unsigned long int. The first of these three types in which the constant's value can be represented is the type assigned to the constant.

Running the below code on Visual C++ 2008:

void verify_type(int a){printf("int [%i/%#x]\n", a, a);}
void verify_type(unsigned int a){printf("uint [%u/%#x]\n", a, a);}
void verify_type(long a){printf("long [%li/%#lx]\n", a, a);}
void verify_type(unsigned long a){printf("ulong [%lu/%#lx]\n", a, a);}
void verify_type(long long a){printf("long long [%lli/%#llx]\n", a, a);}
void verify_type(unsigned long long a){printf("unsigned long long [%llu/%#llx]\n", a, a);}

int _tmain(int argc, _TCHAR* argv[])
{
    printf("sizeof(int) %u\n", (unsigned)sizeof(int));
    printf("sizeof(long) %u\n", (unsigned)sizeof(long));
    printf("sizeof(long long) %u\n\n", (unsigned)sizeof(long long));

    verify_type(-2147483647);
    verify_type(-2147483648);

    getchar();
    return 0;
}

I get this:

sizeof(int) 4
sizeof(long) 4
sizeof(long long) 8

int [-2147483647/0x80000001]
ulong [2147483648/0x80000000]  <------ Why ulong?

I would expect the constant -2147483648 to be int. Why do I get a ulong, not int?

I've been programming for quite a long time, and until today I had not noticed that + or - is not part of an integer constant. This one hint explained everything.

      integer-constant:
              decimal-constant integer-suffix<opt>
              octal-constant integer-suffix<opt>
              hexadecimal-constant integer-suffix<opt>

      decimal-constant:
              nonzero-digit
              decimal-constant digit

      octal-constant:
              0
              octal-constant octal-digit

      hexadecimal-constant:
              0x  hexadecimal-digit
              0X  hexadecimal-digit
              hexadecimal-constant hexadecimal-digit

      nonzero-digit: one of
              1  2  3  4  5  6  7  8  9

      octal-digit: one of
              0  1  2  3  4  5  6  7

      hexadecimal-digit: one of
              0  1  2  3  4  5  6  7  8  9
              a  b  c  d  e  f
              A  B  C  D  E  F

      integer-suffix:
              unsigned-suffix long-suffix<opt>
              long-suffix unsigned-suffix<opt>

      unsigned-suffix: one of
              u  U

      long-suffix: one of
              l  L

This one bugs me. Also, I get different results here: http://ideone.com/HyyI3r — Orace, Mar 23 '15 at 13:48
note that many of your `printf` statements cause undefined behaviour. You cannot pass a negative value to `%x` or `%lx`. — M.M, Mar 23 '15 at 13:51
There is no function overloading in C, so the code can't compile as C. This is C++ code. — Orace, Mar 23 '15 at 13:55
@Orace you are using C++11 for those results, however OP is using a 2008 compiler. C++ did not have `long long` prior to C++11, so a compiler extension must be in play. Compiler extensions are supposed to be documented ... — M.M, Mar 23 '15 at 14:01
When represented as 32 bits, the values 2147483648 and -2147483648 have exactly the same value: `0b10000000000000000000000000000000`. How is the compiler to know which to use? — Evil Dog Pie, Mar 23 '15 at 14:08
@MattMcNabb, in old-school C++, ideone gives a `ulong`: http://ideone.com/WGlarH. Also, the answers are really good for this one. — Orace, Mar 23 '15 at 14:10
@Orace it doesn't say for sure, but presumably that is g++ 4.9.2 running in `--std=gnu89` mode which apparently also has this hybrid of standard integer constant rules, but also a 64-bit int type. — M.M, Mar 23 '15 at 14:14
@MikeofSST that's not an issue here; the question is what the *type* is of the given constant expressions. — M.M, Mar 23 '15 at 14:15
@MattMcNabb Yes. My thinking was that, considering the statement from MSDN in the question, the only way to unambiguously represent 0x80000000 is to use an unsigned long, *if the compiler doesn't know the difference between MSB set and sign bit set*, which it can't if the values are otherwise the same. (I can't quite get my head round it enough to explain in a half-decent way.) — Evil Dog Pie, Mar 23 '15 at 14:24
@MikeofSST: The compiler has a richer representation than just a 32 bits value. For instance, it also has a type (`unsigned long`). That unambiguously tells the compiler that the MSB is NOT a sign bit. The compiler also has an expression `operator-(unsigned long) unsigned_long_literal(2147483648)`. And no, that unary `operator-` does NOT return a signed long. — MSalters, Mar 23 '15 at 14:39
@MSalters According to MSDN [Fundamental Types](https://msdn.microsoft.com/en-us/library/cc953fe1(v=vs.90).aspx), `int`, `unsigned int`, `long` and `unsigned long` are all represented by 4 bytes. Since the negation operator is being used on a compile-time constant, why would it not be done before the constant value is generated? This could result in the unambiguously incorrect result observed. — Evil Dog Pie, Mar 23 '15 at 14:52
@MikeofSST: MSVC has no choice in this. The behavior is well-defined. As for "4 bytes", that's totally irrelevant. Overloading works on the exact type, not `sizeof(type)`. `void foo(int)` and `void foo(long)` are distinct functions. — MSalters, Mar 23 '15 at 14:58
possible duplicate of [Why it is different between -2147483648 and (int)-2147483648](http://stackoverflow.com/questions/12620753/why-it-is-different-between-2147483648-and-int-2147483648) — phuclv, Mar 23 '15 at 16:26
other duplicates: [(-2147483648> 0) returns true in C++?](http://stackoverflow.com/questions/14695118/2147483648-0-returns-true-in-c) [large negative integer literals](http://stackoverflow.com/questions/8511598/large-negative-integer-literals), [Type of integer literals not int by default?](http://stackoverflow.com/questions/8108642/type-of-integer-literals-not-int-by-default), [Casting minimum 32-bit integer (-2147483648) to float gives positive number (2147483648.0)](http://stackoverflow.com/questions/11536389/c-casting-minimum-32-bit-integer-2147483648-to-float-gives-positive-number?rq=1) — phuclv, Mar 23 '15 at 16:26
similar problem: [-32768 not fitting into a 16 bit signed value](http://stackoverflow.com/questions/26375337/32768-not-fitting-into-a-16-bit-signed-value?lq=1) — phuclv, Mar 23 '15 at 16:28

score 5 · Answer 1 · edited Jan 11 '16 at 19:20

You are applying the unary - operator to the integer literal 2147483648. The integer literal, being 2^31, is too large to fit in a 32-bit int. In modern C++ (C++11 and later), it should have been treated as a long long on this platform, so your result is surprising.

I believe old C standards (prior to long long) allowed a literal too large for long to have type unsigned long, which is consistent with what you're seeing. I see the documentation from MSDN you quoted at the top of your post repeats this, so that's surely what's going on here.

answered Mar 23 '15 at 13:46 · edited by Peter Mortensen

C++ did not have `long long` prior to C++11. I'm sure VS2008 didn't have C++11. However, MS may have added their own int64 type as an extension and cooked up some way of handling constants. – M.M Mar 23 '15 at 13:50
The documentation linked by OP is titled "C constants" so perhaps it doesn't apply; however it is consistent with the output he is showing. Perhaps the compiler uses C++03 definition for constants up until 4294967295, and after that, switches to `long long`. Or something. – M.M Mar 23 '15 at 13:57
@MattMcNabb: IIRC, MSVC treated it the sane way. Any literal which was too big causes UB. One valid and reasonable form of UB is an integer literal having type `__int64`. – MSalters Mar 23 '15 at 15:01

score 5 · Answer 2 · answered Mar 23 '15 at 13:46

First, -2147483648 is not an integer constant, because - is a unary operator, not part of a constant (at least in that context). 2147483648 is an integer constant, and -2147483648 is an expression involving that constant.

Because 2147483648 is not representable as an int or long int, but is representable as an unsigned long int, it gets the type unsigned long int. And the result of applying the unary - operator to an unsigned long int is itself an unsigned long int.

score 5 · Accepted Answer · answered Mar 23 '15 at 13:47

-2147483648 is not an integer literal. It is the unary operator - applied to the integer literal 2147483648. That literal's value does not fit in a signed int or signed long, so it has type unsigned long. The - operator does not change that type.

aschepler
