Inconsistencies in sign extension when shifting signed int vs short

Question

int main(){
  signed int a = 0b00000000001111111111111111111111; 
  signed int b = (a << 10) >> 10;
  // b is: 0b11111111111111111111111111111111

  signed short c = 0b0000000000111111; 
  signed short d = (c << 10) >> 10;
  // d is: 0b111111

  return 0;
}

Assuming int is 32 bits and short is 16 bits,

Why would b get sign extended but d does not get sign extended? I have tested this with gdb on x64, compiled with gcc.

In order to get short sign extended, I had to use two separate variables like this:

  signed short f = c << 10;
  signed short g = f >> 10;
  // g is: 0b1111111111111111

Just a guess, but possibly the optimizer realizes that `signed short d = (c << 10) >> 10;` has no observable behaviour. But using two variables means there is a sequence point so then there is? — Jerry Jeremiah, May 19 '21 at 21:22
If you print `sizeof(c<<10)` it prints the number of bytes in an int so it really is integer promotion. — Jerry Jeremiah, May 19 '21 at 21:37
Note that binary literals are not (yet — as of Feb 2020 draft standard) a part of standard C (but they are a part of standard C++). Using the `0b0001` type notation is using an extension to the standard that is widely supported but not standard. — Jonathan Leffler, May 19 '21 at 22:27

score 6 · Accepted Answer · answered May 19 '21 at 21:38

In the case of signed short, when an integer type smaller than int is used in an expression it is (in most cases) promoted to type int. This is spelled out in section 6.3.1.1p2 of the C standard:

The following may be used in an expression wherever an int or unsigned int may be used:

An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.

A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

And this promotion specifically happens in the case of bitwise shift operators as specified in section 6.5.7p3:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

So the short value 0x003f is promoted to the int value 0x0000003f and the left shift is applied. This results in 0x0000fc00, and the right shift gives a result of 0x0000003f.

The signed int case is a bit more interesting. In this case you're left-shifting a bit with the value 1 into the sign bit. This triggers undefined behavior as per 6.5.7p4:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1×2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1×2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

So while the output you get for the signed int case is what you might expect it to be, it's actually undefined behavior and so you can't depend on that result.

@Dan Those are of a larger rank than `int`, so no promotion. — dbush, May 19 '21 at 22:41
The moral of the story is: Don't apply shift operations to signed integer types. These traps are the reason why the MISRA rules specifically forbid it - in the old MISRA-2004 standard, this is rule 12.7. — DavidHoadley, May 20 '21 at 06:31

score 2 · Answer 2 · Eric Postpischil · 2021-05-19T21:34:57.523

short is automatically converted to int by the integer promotions, per C 2018 6.5.7 3:

The integer promotions are performed on each of the operands…

So (c << 10) shifts an int 0b111111 left 10 bits, yielding (in your C implementation) the 32-bit int 0b00000000000000001111110000000000. The sign bit in that is zero; it is a positive number.

When you do signed short f = c << 10;, the result of c << 10 is too big to fit in a signed short. It is 64,512, which is above the largest value your signed short can represent, 32,767. In an assignment, the value is converted to the type of the left operand. Per C 2018 6.3.1.3 3, the conversion is implementation-defined. GCC defines this conversion to wrap modulo 65,536 (two to the power of the number of bits in the type). So converting 64,512 yields 64,512 − 65,536 = −1024. So f is set to −1024.

Then, in f >> 10, you are shifting a negative value. As signed short, f is still promoted to int, but this conversion keeps the value, resulting in an int value of −1024. This is then shifted. This shift is implementation-defined, and GCC defines it to shift with sign extension. So the result of -1024 >> 10 is −1.

edited May 19 '21 at 21:34

answered May 19 '21 at 21:24

Eric Postpischil


But I have asked to shift a `short` value, why is it promoting it to an int? and how can I force similar behaviour as int? – Dan May 19 '21 at 21:25
Explained [here](https://stackoverflow.com/a/46073296/509868) - look for "integer promotions". – anatolyg May 19 '21 at 21:28
@Dan: You cannot ask C to shift a `short` value. In shift expressions, and in most C expressions, `short` operands are automatically converted to `int`. There is no way in C to express a shift of a `short` value. (In a C implementation where `short` and `int` are the same width, you could get the same result, although technically the shift is still done on an `int` value.) – Eric Postpischil May 19 '21 at 21:28
How about larger values like int64? Do they remain as is? – Dan May 19 '21 at 21:30

score 0 · Answer 3 · answered May 19 '21 at 21:33

For starters, according to the C Standard (6.5.7 Bitwise shift operators):

3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand.

Thus this value

signed short c = 0b0000000000111111;

in the expression used in this declaration

signed short d = (c << 10) >> 10;

is promoted to the integer type int. As the value is positive, the promoted value is also positive.

Thus this operation

c << 10

does not touch the sign bit.

On the other hand this code snippet

signed int a = 0b00000000001111111111111111111111; 
signed int b = (a << 10) >> 10;

has undefined behavior because, according to the same section of the C Standard:

4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
