Bitwise shift operation in C on uint64_t variable

Question

I have the following sample code:

uint64_t x, y;
x = ~(0xF<<24);
y = ~(0xFF<<24);

The result would be:

x=0xfffffffff0ffffff
y=0xffffff

Can anyone explain the difference? Why is x calculated over 64 bits and y only over 32?

[Type of integer literals not int by default?](https://stackoverflow.com/q/8108642/995714), [What is the default type of integral literals represented in hex or octal in C++?](https://stackoverflow.com/q/38782709/995714) — phuclv, Oct 16 '18 at 10:28

ciphor · Accepted Answer · 2012-02-15T12:26:51.340

6

By default the operation is performed in `int`, which is 32 bits wide on this platform.

x=~(0xf<<24);

This code can be broken down into the following steps:

int32_t a;
a=0x0000000f;
a<<=24;   // a=0x0f000000;
a=~a;     // a=0xf0ffffff;
x=(uint64_t)a;  // x = 0xfffffffff0ffffff;

And,

y = ~(0xFF<<24);

int32_t a;
a=0x000000ff;
a<<=24;   // a=0xff000000;
a=~a;     // a=0x00ffffff;
y=(uint64_t)a;  // y = 0x0000000000ffffff;
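
The two breakdowns above can be reproduced without relying on signed overflow. The following is a minimal sketch, assuming a 32-bit two's complement int as on the asker's platform; `emulate_int32_mask` is a hypothetical helper name, not part of any library. It does the shift and the inversion on uint32_t and then sign-extends by hand.

```c
#include <stdint.h>

/* Hypothetical helper: mimics ~(value << shift) evaluated in a 32-bit
 * two's complement int and then assigned to a uint64_t, but uses only
 * unsigned arithmetic so that no step overflows a signed type.
 * Assumes shift < 32. */
uint64_t emulate_int32_mask(uint32_t value, unsigned shift)
{
    uint32_t shifted  = (uint32_t)(value << shift);
    uint32_t inverted = (uint32_t)~shifted;

    /* Sign extension: copy bit 31 into bits 32..63. */
    uint64_t wide = inverted;
    if (inverted & UINT32_C(0x80000000)) {
        wide |= UINT64_C(0xFFFFFFFF00000000);
    }
    return wide;
}
```

Under these assumptions, emulate_int32_mask(0xF, 24) gives 0xfffffffff0ffffff and emulate_int32_mask(0xFF, 24) gives 0x0000000000ffffff.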

edited Feb 15 '12 at 12:26

answered Feb 15 '12 at 09:39


To be picky, the default is (signed) `int`, whatever that might be. – Lundin Feb 15 '12 at 09:57
And strictly speaking, 0xFF<<24 for a 32-bit system is undefined behavior and the result could be anything. – Lundin Feb 15 '12 at 10:52
I cannot agree. The behavior should be predictable. If 'a' has type 'char', then 0xff<<24 will result in a negative value; otherwise, if 'a' has type short or int, it will be a positive value. – ciphor Feb 15 '12 at 12:29

unwind · Answer 2 · 2012-02-15T09:57:07.170

2

Because 0x0f << 24 is a positive number when viewed as an int, it's sign-extended to a positive number, i.e. to 0x00000000_0f000000 (the underscore is just for readability, C does not support this syntax). This is then inverted into what you're seeing.

0xff << 24 on the other hand is negative, so it's sign-extended differently.

edited Feb 15 '12 at 09:57

answered Feb 15 '12 at 09:39


Strictly speaking, 0xFF<<24 for a 32-bit system is undefined behavior and the result could be anything. – Lundin Feb 15 '12 at 10:52

Firedragon · Answer 3 · 2012-02-15T12:01:55.083

2

Other posters have shown why it does this. But to get the expected results:

uint64_t x, y; 
x = ~(0xFULL<<24); 
y = ~(0xFFULL<<24);

Or you can do this (I don't know if this is any slower than the above though):

uint64_t x, y;
x = ~((uint64_t)0xF<<24);
y = ~((uint64_t)0xFF<<24);

Then:

x = 0xfffffffff0ffffff
y = 0xffffffff00ffffff
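
The fix can also be wrapped in a small function so that the shift always happens in 64 bits. This is only a sketch; `clear_mask64` is a name made up for illustration, and it assumes the shift count is less than 64.

```c
#include <stdint.h>

/* Hypothetical helper: returns a mask in which the bits of `bits`,
 * moved left by `shift`, are cleared and all other bits are set.
 * Because `bits` is already uint64_t, both the shift and the
 * inversion are done in 64-bit unsigned arithmetic.
 * Assumes shift < 64. */
uint64_t clear_mask64(uint64_t bits, unsigned shift)
{
    return ~(bits << shift);
}
```

clear_mask64(0xF, 24) evaluates to 0xfffffffff0ffffff and clear_mask64(0xFF, 24) to 0xffffffff00ffffff. For constants, the UINT64_C macro from <stdint.h> is another way to get a literal that is at least 64 bits wide, for example ~(UINT64_C(0xFF) << 24).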

edited Feb 15 '12 at 12:01

answered Feb 15 '12 at 11:55


score 1 · Answer 4 · answered Feb 15 '12 at 10:50

You have undefined behavior in your program so anything might happen.

The integer literals 0xF or 0xFF are of type int, which is equivalent to signed int. On this particular platform, int is apparently 32 bits.
The integer literal 24 is also a (signed) int.
When the compiler evaluates the << operation, both operands are (signed) int so no implicit type promotions take place. The result of the << operation is therefore also a (signed) int.
The value 0xF<<24 = 0x0F000000 fits in a (signed) int as a non-negative value, so everything is ok.
The value 0xFF<<24 = 0xFF000000 does not fit in (signed) int! Here, undefined behavior is invoked and anything might happen.

ISO 9899:2011 6.5.7/4:

"The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros." /--/

"If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined."

So the expression 0xFF<<24 can't be used. The program is free to print any garbage value after that.
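
The quoted rule can be turned into a check. The sketch below models a 32-bit int with int32_t, which is an assumption about the platform rather than something the standard guarantees; `left_shift_is_defined_i32` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when value << shift would be well
 * defined for a 32-bit signed int according to the rule quoted above:
 * value must be nonnegative and value * 2^shift must be representable.
 * A shift count of 32 or more is rejected as well, since shifting by
 * the full width of the type is also undefined. */
bool left_shift_is_defined_i32(int32_t value, unsigned shift)
{
    if (value < 0 || shift >= 32) {
        return false;
    }
    /* value * 2^shift <= INT32_MAX  <=>  value <= INT32_MAX >> shift */
    return value <= (INT32_MAX >> shift);
}
```

With this check, left_shift_is_defined_i32(0xF, 24) is true, while left_shift_is_defined_i32(0xFF, 24) is false.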

But if we ignore that one and focus on 0x0F<<24:

0x0F000000 is still a (signed) int. The ~ operator is applied to this.
The result is 0xF0FFFFFF, which is still a signed int. On almost any system, this 32-bit pattern represents a negative number in two's complement.
This signed int is converted to the type uint64_t during assignment. Conceptually this happens in two steps: first the value is sign-extended to a signed 64-bit integer, then that signed 64-bit integer is converted to an unsigned 64-bit one.
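
The conversion described in the last step can be checked directly. This is a minimal sketch with hypothetical function names; it relies only on the rule that converting a negative value to uint64_t wraps modulo 2^64, so the one-step and two-step forms should agree.

```c
#include <stdint.h>

/* Direct conversion, as happens on assignment to a uint64_t. */
uint64_t to_u64_direct(int32_t v)
{
    return (uint64_t)v;
}

/* The same conversion spelled out as two steps: widen to a signed
 * 64-bit value first (sign extension), then convert to unsigned,
 * which wraps negative values modulo 2^64. */
uint64_t to_u64_two_step(int32_t v)
{
    int64_t widened = (int64_t)v;
    return (uint64_t)widened;
}
```

For v = -0x0F000001, which has the 32-bit pattern 0xF0FFFFFF in two's complement, both functions return 0xfffffffff0ffffff.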

Bugs like this are why the coding standard MISRA-C contains a number of rules to ban sloppy use of integer literals in expressions like this. MISRA-C compliant code must use the u suffix after each integer literal (MISRA-C:2004 10.6) and the code is not allowed to perform bitwise operations on signed integers (MISRA-C:2004 12.7).
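
An unsigned suffix removes the undefined behavior but does not by itself produce a 64-bit mask. A small sketch, assuming unsigned int is 32 bits wide (modeled here with uint32_t) and using hypothetical function names:

```c
#include <stdint.h>

/* Hypothetical helper: ~(value << shift) computed entirely in 32-bit
 * unsigned arithmetic, as with literals written 0xFu or 0xFFu when
 * unsigned int is 32 bits wide. Assumes shift < 32. */
uint32_t unsigned_mask32(uint32_t value, unsigned shift)
{
    return (uint32_t)~(uint32_t)(value << shift);
}

/* Assigning the 32-bit unsigned result to a uint64_t zero-extends it. */
uint64_t widen_unsigned_mask32(uint32_t value, unsigned shift)
{
    return unsigned_mask32(value, shift);
}
```

Here widen_unsigned_mask32(0xF, 24) is 0x00000000f0ffffff and widen_unsigned_mask32(0xFF, 24) is 0x0000000000ffffff. Both are well defined, but the upper 32 bits are zero, so a 64-bit mask still needs a 64-bit operand such as one written with the ULL suffix.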
