Sign extension with bitwise shift operation

Question

following this Q&A I tried to exam the answer so I wrote:

#include <stdio.h>

int main ()
{

        int t;int i;
        for (i=120;i<140;i++){
                t = (i - 128) >> 31;
                printf ("t = %X , i-128 = %X ,  ~t & i = %X , ~t = %X \n", t, i-128 , (~t &i), ~t);
        }

        return 0;
}

and the Output is:

t = FFFFFFFF , i-128 = FFFFFFF8 ,  ~t & i = 0 , ~t = 0 
t = FFFFFFFF , i-128 = FFFFFFF9 ,  ~t & i = 0 , ~t = 0 
t = FFFFFFFF , i-128 = FFFFFFFA ,  ~t & i = 0 , ~t = 0 
t = FFFFFFFF , i-128 = FFFFFFFB ,  ~t & i = 0 , ~t = 0 
t = FFFFFFFF , i-128 = FFFFFFFC ,  ~t & i = 0 , ~t = 0 
t = FFFFFFFF , i-128 = FFFFFFFD ,  ~t & i = 0 , ~t = 0 
t = FFFFFFFF , i-128 = FFFFFFFE ,  ~t & i = 0 , ~t = 0 
t = FFFFFFFF , i-128 = FFFFFFFF ,  ~t & i = 0 , ~t = 0 
t = 0 , i-128 = 0 ,  ~t & i = 80 , ~t = FFFFFFFF 
t = 0 , i-128 = 1 ,  ~t & i = 81 , ~t = FFFFFFFF 
t = 0 , i-128 = 2 ,  ~t & i = 82 , ~t = FFFFFFFF 
t = 0 , i-128 = 3 ,  ~t & i = 83 , ~t = FFFFFFFF 
t = 0 , i-128 = 4 ,  ~t & i = 84 , ~t = FFFFFFFF 
t = 0 , i-128 = 5 ,  ~t & i = 85 , ~t = FFFFFFFF 
t = 0 , i-128 = 6 ,  ~t & i = 86 , ~t = FFFFFFFF 
t = 0 , i-128 = 7 ,  ~t & i = 87 , ~t = FFFFFFFF 
t = 0 , i-128 = 8 ,  ~t & i = 88 , ~t = FFFFFFFF 
t = 0 , i-128 = 9 ,  ~t & i = 89 , ~t = FFFFFFFF 
t = 0 , i-128 = A ,  ~t & i = 8A , ~t = FFFFFFFF 
t = 0 , i-128 = B ,  ~t & i = 8B , ~t = FFFFFFFF

Why does ~t of any negative number is -1 == 0xFFFFFFFF if t declared as integer ?

`~0 == 0xFFFFFFFF`, `~~x==x`. You invert only zeroes and `-1`. (Assuming 32bit int) — Aneri, Mar 31 '13 at 13:22
@AlexeyFrunze @Aneri why does `t = (i - 128) >> 31` gives zero or -1 for each number ? — 0x90, Mar 31 '13 at 13:26
@0x90 because apparently your platform uses arithmetic right shifts — harold, Mar 31 '13 at 13:28
The standard says that right shifting a negative value is implementation defined. Read your compiler documentation to see what it does in this case... — DCoder, Mar 31 '13 at 13:29
@harold can you put it as an answer I think you have got the right answer — 0x90, Mar 31 '13 at 13:30
@0x90 ok just read my answer too and let me know if it does helps. — Grijesh Chauhan, Mar 31 '13 at 15:01
@0x90 because I have got comment may I know what are you using c or c++ you have taged both — Grijesh Chauhan, Mar 31 '13 at 19:08
@GrijeshChauhan, The question is for c and c++ as well, I posted the code only in c. — 0x90, Mar 31 '13 at 19:34
@0x90 ok, as James commented to my answer. Please consider my answer for C(its also valid in Java) I don't now that nature of `>>` in c++ is platform dependent. — Grijesh Chauhan, Mar 31 '13 at 19:37
@GrijeshChauhan I think James meant it can be overloaded and that is not the case in our context, assuming I understood him correctly. — 0x90, Mar 31 '13 at 19:38

score 6 · Accepted Answer · edited May 23 '17 at 11:44

From: Right shifting negative numbers in C

Edit: According to the Section 6.5.7 of the latest draft standard, this behavior on negative numbers is implementation dependent:

The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

And, your implementation is probably doing an Arithmetic Shift with two's complement numbers

Operator >> as Signed right shift or arithmetic right shift, shift all the bits to right a specified number of times. Important is >> fills leftmost sign bit (Most Significant Bit MSB) to leftmost bit the after shift. This is called sign extension and serves to preserve the sign of negative numbers when you shift them right.

Below is my diagrammatic representation with an example to show how this works (for one byte):
Example:

i = -5 >> 3;  shift bits right three time

Five in two's complement form is 1111 1011 Memory Representation:

 MSB
+----+----+----+---+---+---+---+---+
|  1 |  1 | 1  | 1 | 1 | 0 | 1 | 1 |   
+----+----+----+---+---+---+---+---+
   7    6   5    4   3   2   1   0  
  ^  This seventh, the left most bit is SIGN bit

And below is, how >> works? When you do -5 >> 3

                        this 3 bits are shifted 
                         out and loss
 MSB                   (___________)      
+----+----+----+---+---+---+---+---+
|  1 |  1 | 1  | 1 | 1 | 0 | 1 | 1 |   
+----+----+----+---+---+---+---+---+
  | \                 \  
  |  ------------|     ----------|
  |              |               |
  ▼              ▼               ▼
+----+----+----+---+---+---+---+---+
|  1 |  1 | 1  | 1 | 1 | 1 | 1 | 1 |
+----+----+----+---+---+---+---+---+
(______________)
 The sign is        
 propagated

Notice: the left most three bits are one because on each shift sign bit is preserved and each bit is right too. I have written The sign is propagated because all this three bits are because of sign(but not data).

[ANSWER]
In your output in

first Eight lines

      ~t is 0
==>    t is FFFFFFFF 
==>    t is -1

(Note: 2's complement of -1 is FFFFFFFF, because 1 = 00000001, 1's complement of 1 is FFFFFFFE, and 2's complement = 1's complement + 1 that is: FFFFFFFE + 00000001 = FFFFFFFF)

So t is always evaluated -1 in first eight times in loop. Yes, How ?

In for loop

for (i=120;i<140;i++){
     t = (i - 128) >> 31;

values of i for first eight time is i = 120, 121, 122, 123, 124, 125, 126 ,127 all eight values are less then 128. So return of (i - 128) = -8, -7, -6, -5, -4, -3, -2, -1. hence in first eight times expression t = (i - 128) >> 31 shift-rights a negative number.

t =   (i - 128)  >> 31
t =  -ve number  >> 31

Because in your system int is of 4 bytes = 32 bits, all right most 31 bits are shift-out and loss and due to propagation of sign bit that is 1 for negative number all bits value becomes 1. (as I have shown in above figure for one byte)

So for fist eight times:

    t =  -ve number  >> 31 ==  -1 
    t = -1
  and this gives 
    ~t = 0

Hence output of fist eight times for ~t is 0.

For remaining last lines

      ~t is FFFFFFFF
==>   ~t is -1   
==>    t is 0

For remaining last lines, in for loop

for (i=120;i<140;i++){
     t = (i - 128) >> 31;

i values are 128, 129, 130, 132, 133, 134, 135, 136, 137, 138, 139, all are greater then or equals to 128. and sign bit is 0.

So (i - 128) for remaining last lines is >=0 and for all this MSB sign bit = 0. And because again you shifts right 31 times all bits excepts then sigh bit shift-out and sign bit 0 propagates and fills all bits with 0 and magnitude becomes 0.

I think it would be good if I write an example for a positive number too. So we take an example 5 >> 3 and five is one byte is 0000 0101

                        this 3 bits are shifted 
                         out and loss
 MSB                   (___________)      
+----+----+----+---+---+---+---+---+
|  0 |  0 | 0  | 0 | 0 | 1 | 0 | 1 |   
+----+----+----+---+---+---+---+---+
  | \                 \  
  |  ------------|     ----------|
  |              |               |
  ▼              ▼               ▼
+----+----+----+---+---+---+---+---+
|  0 |  0 | 0  | 0 | 0 | 0 | 0 | 0 |
+----+----+----+---+---+---+---+---+
(______________)
 The sign is        
 propagated

See again I writes The sign is propagated, So leftmost three zeros are due to sign bit.

So this is what operator >> Signed right shift do, and preserves the sign of left operand.

Actually, this answer is wrong for C++. Whether `>>` sign fills or zero fills is implementation dependent. — James Kanze, Mar 31 '13 at 15:20
@JamesKanze ok only for c++, OP using `ptintf` also header files are .h and also taged to `C` So I think answer is correct here. Yes but the linked question is from c++ and Java. and my answer is also correct for Java. So I think I should not remove my answer. — Grijesh Chauhan, Mar 31 '13 at 15:47
@JamesKanze You are right if the operator is overloaded but in the original question about the branch prediction `>>` is meant to be the naive one. — 0x90, Mar 31 '13 at 19:35
No. C++ just copies C here, and the C standard is quite clear. If the left hand operand has a signed type and is negative, the results are implementation defined. (The reason, I believe, is that not all processors have an arithmetic shift right instruction.) — James Kanze, Mar 31 '13 at 22:34
@JamesKanze Yes you are correct I was unaware of this. I Just found a link. Thanks :) — Grijesh Chauhan, Apr 01 '13 at 06:18

score 5 · Answer 2 · answered Mar 31 '13 at 13:48

Why does t = (i - 128) >> 31 gives zero or -1 for each number?

When a non-negative 32-bit integer is shifted right 31 positions, all non-zero bits get shifted out and the most significant bits get filled with 0, so you end up with 0.

Usually, when a negative 32-bit integer is shifted right 31 positions, the most significant bits don't get filled with 0, instead they get set to the sign of the number and so the sign propagates into all bits and in 2's complement representation all bits set to 1 amount to -1. The net effect is as if you repeatedly divided the number by 2, but with a little twist... The result is rounded towards -infinity instead of towards 0. For example -2>>1==-1 but -3>>1==-2 and -5>>1==-3. This is called arithmetic right shift.

When I say "usually", I mean the fact that the C standard allows several different behaviors for right shifts of negative values. On top of that, it allows non-2's complement representations of signed integers. Usually, however, you have the 2's complement representation and the behavior that I've showed/explained above.

score 2 · Answer 3 · answered Mar 31 '13 at 13:29

2

Becaue t is either 0 or -1, ~t is also always -1 or 0.

This is due to the (implementation defined) behaviour or (i - 128) >> 31, which essentially copies the top bit of (i-128) [assuming 32-bit integers]. If i is > 128, it will result in a zero in the top bit. If i is less than 128, the result is negative, so the top bit is set.

Since ~t is "all bits opposite of t", you can expect that t is always 0xffffffff if t is zero.

answered Mar 31 '13 at 13:29

Mats Petersson

126,704
14
140
227

The point here is that the shift is arithmetic and not logical shift since U declared it `int` and not `u32` or `unsigned int`. – 0x90 Mar 31 '13 at 13:33

anishsane · Answer 4 · 2013-03-31T14:36:52.627

The >> operator, right shift, is arithmetic right shift in most compilers, meaning divide-by-2.

So, if, e.g. int i ==-4 (0xfffffffc), then i>>1 == -2 (0xfffffffe).

Having said this, I would recommend you to check the assembly of your code.
e.g. x86 has 2 separate instructions - shr & sar, denoting logical shift & arithmetic shift respectively.
Generally, compilers use shr (logical shift) for unsigned variables & sar (arithmetic shift) for signed variables.

Below is the C code & corresponding assembly, generated with gcc -S:

a.c:

int x=10;
unsigned int y=10;

int main(){
    unsigned int z=(x>>1)+(y>>1);
    return 0;
}

a.s:

    .file   "a.c"
.globl x
    .data
    .align 4
    .type   x, @object
    .size   x, 4
x:
    .long   10
.globl y
    .align 4
    .type   y, @object
    .size   y, 4
y:
    .long   10
    .text
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    x, %eax
    sarl    %eax ; <~~~~~~~~~~~~~~~~ Arithmetic shift, for signed int
    movl    y, %edx
    shrl    %edx ; <~~~~~~~~~~~~~~~~ Logical shift, for unsigned int
    addl    %edx, %eax
    movl    %eax, -4(%ebp)
    movl    $0, %eax
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section    .note.GNU-stack,"",@progbits

score 0 · Answer 5 · answered Mar 31 '13 at 19:43

The rule in C and C++ is that the result of a right-shift of a negative value is implementation defined. So read your compiler's documentation. The various explanations you've gotten are valid approaches, but none of these is mandated by the language definition.

Sign extension with bitwise shift operation

5 Answers5

Linked