why can't you shift a uint16_t

Question

I am trying to fill a 64-bit unsigned variable by combining 16-bit and 8-bit values:

uint8_t byte0 = 0x00;
uint8_t byte1 = 0xAA;
uint8_t byte2 = 0x00;
uint8_t byte3 = 0xAA;

uint16_t hword0 = 0xAA00;
uint16_t hword1 = 0xAAAA;

uint64_t result = ( hword0 << 32 ) + ( byte3 << 24 ) + 
                  ( byte2 << 16 ) + ( byte1 << 8 ) + ( byte0 << 0  );

This gives me a warning.

 left shift count >= width of type [-Wshift-count-overflow]
 uint64_t result = ( hword0 << 32 )

For example, in `byte3 << 24` `byte3` is only 8 bits long, but you're trying to shift it by 24 bits, which won't fit, so they'll get lost — ForceBru, Mar 08 '19 at 16:04

Jean-Baptiste Yunès · Answer 1 · 2019-03-08T16:48:42.623

7

hword0 is 16 bits long and you request for a 32 bit shift. Shifting more than the number of bits - 1 is undefined.

Solution is to convert your components to the destination type : uint64_t result = ( ((uint64_t)hword0) << 32 ) + etc.

edited Mar 08 '19 at 16:48

answered Mar 08 '19 at 16:07

Jean-Baptiste Yunès

34,548
4
48
69

2

It's only undefined for signed integers, right? – Fiddling Bits Mar 08 '19 at 16:07
@FiddlingBits right, need to rephrase it for the right case! – Jean-Baptiste Yunès Mar 08 '19 at 16:08
3

[It is undefined behavior for unsigned too](https://stackoverflow.com/a/18918340/5522303) – Kevin Mar 08 '19 at 16:11
@Kevin For C too? – Fiddling Bits Mar 08 '19 at 16:12
@kevin No only for right shift. Left shift of unsigned is well defined. – Jean-Baptiste Yunès Mar 08 '19 at 16:12
@FiddlingBits I believe so. It's a limitation of architectures that use a limited number of bits to store the shift operand. – Kevin Mar 08 '19 at 16:13
@Jean-BaptisteYunès Ah you might be right. I thought it was undefined for both. – Kevin Mar 08 '19 at 16:13
@Kevin No, C left-shift on unsigned is modulo. – Andrew Henle Mar 08 '19 at 16:14
1

@Kevin I apologize for my mistake, you were right! shifting more than the size is undefined behavior. – Jean-Baptiste Yunès Mar 08 '19 at 16:50
@AndrewHenle Not module, UB. Re: "If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined." §6.5.7 Independent of r/l, sign-ness. – chux - Reinstate Monica Mar 08 '19 at 17:50
Detail: In C, "Shifting more than the number of bits - 1" --> _in the promoted type_ "is undefined". – chux - Reinstate Monica Mar 08 '19 at 18:47

glglgl · Answer 2 · 2019-03-08T16:33:39.357

5

As opposed to your question tile, you can shift a uint16_t. But you cannot shift it (losslessly) by more than its width.

Your input operand's type is applied to the output operand as well, so in your original question, you have a uint16_t << 32 which is 0 (because any value shifted by 32 to the left and then clipped to 16 bits is 0), and so are nearly all of your uint8_t values.

The solution is simple: before shifting, cast your values to the appropriate type suitable for shifting:

uint64_t result = ( (uint64_t)hword0 << 32 ) + 
( (uint32_t)byte3 << 24 ) + ( (uint32_t)byte2 << 16 ) + ( (uint32_t)byte1 << 8  ) + ( (uint32_t)byte0 << 0  );

edited Mar 08 '19 at 16:33

answered Mar 08 '19 at 16:06

glglgl

89,107
13
149
217

1

Re: With 32-bit `int` and `uint16_t << 32`, the "... and then clipped to 16 bits is ..." mis-leads. In C, the value is not "clipped" to 16-bits - the operand's type in this shift is `int`. `uint16_t << 32` is UB because the shift is not in the 0-31 range (`int` width). OTOH, the solution is good. – chux - Reinstate Monica Mar 08 '19 at 18:43
@chux 'the operand's type in this shift is `int`' No. As was already discussed, `<<` is one of the operands which don't implicitly convert their argument to `int`. And if it would, it wpuld be `unsigned int`, where `<< 32` is not UB. – glglgl Mar 08 '19 at 21:08
1

C, as post is tagged, specifies "6.5.7 Bitwise shift operators "The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand" On a 32 bit `int` system, `uint16_t` is promoted to `int`, to `unsigned` on a 16-bit one. This answer's link addresses "usual arithmetic conversions" – chux - Reinstate Monica Mar 08 '19 at 21:41

score 4 · Accepted Answer · edited Mar 08 '19 at 18:33

You can shift a uint16_t. What you can't do is shift an integer value by a number greater than or equal to the size of the type. Doing so invokes undefined behavior. This is documented in section 6.5.7p3 of the C standard regarding bitwise shift operators:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

You would think that this means that any shift greater than or equal to 16 on a uint16_t is not valid. However, as mentioned above the operands of the << operator are subject to integer promotion. This means that any value with a rank lower than int is promoted to int before being used in an expression. So if int is 32 bits on your system, then you can left shift up to 31 bits.

This is why ( byte3 << 24 ) + ( byte2 << 16 ) + ( byte1 << 8 ) + ( byte0 << 0 ) don't generate a warning even though byte is a uint8_t while ( hword0 << 32 ) is not. There is still an issue here however because of the promotion to int. Because the promoted value is now signed, you run the risk of shifting a 1 into the sign bit. Doing so invokes undefined behavior as well.

To fix this, any value that is shifted left by 32 or more must be first casted to uint64_t so that the value can be operated on properly, as well as any value that may end up shifting a 1 into the sign bit:

 uint64_t result = ( (uint64_t)hword0 << 32 ) + 
    ( (uint64_t)byte3 << 24 ) + ( (uint64_t)byte2 << 16 ) + 
    ( (uint64_t)byte1 << 8  ) + ( byte0 << 0  );

score 2 · Answer 4 · edited Jun 20 '20 at 09:12

According to the warning, 32 bits is more or equal to the size of the operand on the target system. The C++ standard says:

[expr.shift]

The operands shall be of integral or unscoped enumeration type and integral promotions are performed.The type of the result is that of the promoted left operand. The behavior is undefined if the right operandis negative, or greater than or equal to the length in bits of the promoted left operand.

Corresponding rule from the C standard:

Bitwise shift operators

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

According to the quoted rule, the behaviour of the your program is undefined whether it is written in C or C++.

You can solve the problem by explicitly converting the left hand operand of the shift to a sufficient large unsigned type.

P.S. On systems where uint16_t is smaller than int (which is quite typical), a uint16_t oprand will be promoted to int when used as an arithmetic operand. As such, byte2 << 16 is not unconditionally^† undefined on such systems. You shouldn't rely on this detail, but that explains why you see no warning from the compiler regarding that shift.

^† byte2 << 16 can still be undefined if the result is outside the range of representable values of the (signed) int type. It would be well defined if the promoted type was unsigned.

@Jean-BaptisteYunès that contradicts what the standard says. Do you have a standard rule to back that up? — eerorika, Mar 08 '19 at 16:32
You're right, I misinterpreted the paragraph that follows in the standard. This one is the right one at the first place! — Jean-Baptiste Yunès, Mar 08 '19 at 16:47

Andrew Henle · Answer 5 · 2019-03-08T17:02:36.373

0

byte2 << 16

is left-shifting an 8-byte value 16 bytes. That won't work. Per 6.5.7 Bitwise shift operators, paragraph 4 of the C standard:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 x 2E2 , reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 x 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

Since you're using a left shift on unsigned values, you get zero.

EDIT

Per paragraph 3 of the same section, it's actually undefined behavior:

If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

You want something like

( ( uint64_t ) byte2 ) << 16

The cast to a 64-bit value will ensure the result doesn't lose bits.

edited Mar 08 '19 at 17:02

answered Mar 08 '19 at 16:06

Andrew Henle

32,625
3
24
56

Is it really that way, or are all operands converted to `unsigned int` first? In this case, only the first operand in the sum would be affected. – glglgl Mar 08 '19 at 16:17
@glglgl See https://stackoverflow.com/questions/3482262/bitshift-and-integer-promotion – Andrew Henle Mar 08 '19 at 16:25
Oh, thanks. This helps me. – glglgl Mar 08 '19 at 16:31
"is left-shifting an 8-byte value 16 bytes." and "using a left shift on unsigned values, you get zero." --> In C, the 8-byte value is first promoted to `int`. Then the _signed_ shift and width of `int` rules apply. – chux - Reinstate Monica Mar 08 '19 at 17:59

2785528 · Answer 6 · 2019-03-08T17:03:01.607

To do what you want to do, the key idea is to use intermediate uint64_t (the final size) in which to shuffle bits.

The following compiles with no warnings:

you can use auto promotion (and no cast)

     {
        uint64_t b4567 = hword0; // auto promotion
        uint64_t b3    = byte3;
        uint64_t b2    = byte2;
        uint64_t b1    = byte1;
        uint64_t b0    = byte0;

        uint64_t result =  (
           (b4567 << 32) |
           (b3    << 24) |
           (b2    << 16) |
           (b1    <<  8) |
           (b0    <<  0)   );
     }

you can also use static cast (multiple times):

     {
        uint64_t result = (
           (static_cast<uint64_t>(hword0) << 32) |
           (static_cast<uint64_t>(byte3)  << 24) |
           (static_cast<uint64_t>(byte2)  << 16) |
           (static_cast<uint64_t>(byte1)  <<  8) |
           (static_cast<uint64_t>(byte0)  << 0 )
           );

        cout << "\n  " << hex << result << endl;
     }

And you can do both by creating a function to a) perform the static cast and b) with a formal parameter to get the compiler to auto-promote.

function looks like:

//           vvvvvvvv ---- formal parameter
uint64_t sc (uint64_t ui64) {
  return static_cast<uint64_t>(ui64);
}


     // using static cast function
     {
        uint64_t result = (
           (sc(hword0) << 32) |
           (sc(byte3)  << 24) |
           (sc(byte2)  << 16) |
           (sc(byte1)  <<  8) |
           (sc(byte0)  <<  0)
           );

        cout << "\n  " << hex << result << endl;
     }

From my hardware background, when bit shifting, I prefer | instead of + (and & for masking) — 2785528, Mar 08 '19 at 16:55

chux - Reinstate Monica · Answer 7 · 2019-03-08T18:36:25.917

From a C perspective:

Much discussion here omits that a uint8_t applied to a shift (left or right) is first promoted to an int, and then the shift rules are applied.

Same occurs with uint16_t when int is 32-bit. (17 bit or more)

When int is 32-bit

hword0 << 32 is UB due to the shift amount too great: outside 0 to 31.
byte3 << 24 is UB when attempting to shift into the sign bit. byte3 & 0x80 is true.

Other shifts are OK.

Had int been 64-bit, OP's original code is fine - no UB, including hword0 << 32.

Had int been 16-bit, all of code's shifts (aside from << 0) are UB or potential UB.

To do this, without casting (Something I try to avoid), consider

// uint64_t result = (hword0 << 32) + (byte3 << 24) + (byte2 << 16) + (byte1 << 8) + byte0

// Let an optimizing compiler do its job
uint64_t result = hword0;
result <<= 8;
result += byte3;
result <<= 8;
result += byte2;
result <<= 8;
result += byte1;
result <<= 8;
result += byte0;

Or

uint64_t result = (1ull*hword0 << 32) + (1ul*byte3 << 24) + (1ul*byte2 << 16) +
    (1u*byte1 << 8) + byte0;

why can't you shift a uint16_t

7 Answers7

Linked