Type of integer literals and ~ in C

Question

I'm a C beginner, and I'm confused by the following example found in the C answer book.

One way to find the size of unsigned long long on your system is to type:

printf("%llu", (unsigned long long) ~0);

I have no idea why this syntax works?

On my system, int is 32 bits and long long is 64 bits.
What I expected was that, since 0 is a constant of type int, ~0 calculates the bitwise negation of a 32-bit integer, which is then converted to an unsigned long long by the cast operator. This should give 2³² - 1 as a result.

Somehow, it looks like the ~ operator already knows that it should act on 64 bits?
Does the compiler interpret this instruction as printf("%llu", ~(unsigned long long)0); ? That doesn't sound right, since the cast and ~ have the same priority.

score 14 · Accepted Answer · edited Jun 20 '20 at 09:12

Somehow, it looks like the ~ operator already knows that it should act on 64 bits?

It's not the ~ operator, it's the cast. Here is how the integer conversion is done according to the standard:

6.3.1.3 Signed and unsigned integers

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

The value of signed int ~0 corresponds to -1 on systems with two's complement representation of negative values. It cannot be represented by an unsigned long long, so the first bullet point does not apply.

The second bullet point does apply: the new type is unsigned, so one more than the maximum value of unsigned long long (that is, ULLONG_MAX + 1) is added to -1 once to bring the result into the range of unsigned long long. This has the same effect as sign-extending -1 to 64 bits.

The value of `~0` is dependent on the signed representation in the implementation. `~0` is different from `-1` in representations other than two's complement. — ouah, Feb 16 '15 at 15:47
So the expected behaviour would happen with `(unsigned long long)~0u`, right ? — Quentin, Feb 16 '15 at 16:56
@Quentin: That always depends on what you expect. Just avoid this one, as it's too implementation-dependent. — Deduplicator, Feb 16 '15 at 19:07

score 5 · Answer 2 · answered Feb 16 '15 at 15:29

0 is of type int, not unsigned int. ~0 will therefore (on machines that use two's complement integer representation, which is all that are in use today) be -1, not 2³² - 1.

Assuming a 64-bit unsigned long long, (unsigned long long) -1 is -1 modulo 2⁶⁴, which is 2⁶⁴ - 1.

Wintermute


Shouldn't that be *implementation defined*, not for sure (casting signed value to longer unsigned type)? Yes, on real normal 2s-complement machines. In this case, it may use the most natural available instruction for widening the value, which might not be sign-extending. – JDługosz Feb 16 '15 at 22:07
@Wintermute do you mean: `0` is of type `signed int` not `unsigned int` ... ? – umlcat Feb 17 '15 at 00:30
@umlcat `int` is `signed int`. – Wintermute Feb 17 '15 at 09:05

score 0 · Answer 3 · answered Feb 16 '15 at 15:30

0 is an int

~0 is still an int, namely the value -1.

Casting an int to unsigned long long is there merely to match the type that printf expects with the conversion llu.

The value -1 converted to unsigned long long is 0xffffffffffffffff when unsigned long long is 8 bytes wide, regardless of whether int is 4 or 8 bytes.

score 0 · Answer 4 · edited May 23 '17 at 12:00

According to N1570 Committee Draft:

6.5.3.3 Unary arithmetic operators

The result of the ~ operator is the bitwise complement of its (promoted) operand (that is, each bit in the result is set if and only if the corresponding bit in the converted operand is not set). The integer promotions are performed on the operand, and the result has the promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent to the maximum value representable in that type minus E.

6.2.6.2 Integer types

(ones’ complement). Which of these applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones’ complement), is a trap representation or a normal value. In the case of sign and magnitude and ones’ complement, if this representation is a normal value it is called a negative zero.

Hence, the behavior of code:

printf("%llu", (unsigned long long) ~0);

on some machines is implementation-defined or even undefined, and may not be what you expect; it depends on the machine's internal representation of integers.

And according to section 6.5.3.3, the approved way to write the code would be:

printf("%llu", (unsigned long long) ~0u);

Further, the type of ~0u is unsigned int, whereas you are casting it to unsigned long long int, for which the format specifier is %llu. To print ~0u itself, use the format specifier %u.

To learn the basic concept of type casting, you may like to read: What exactly is a type cast in C/C++?
