4

Here is two code that appears to be doing same thing,but It does not. These two different when run and compared output with tracing gives confusion as it appears that the 1st code processing is machine dependent code. Please read the two codes

Code 1:--

unsigned char c=(((~0 << 3) >> 4) << 1);
printf("%d", c);

Output:-- 254

Code 2:--

unsigned char c=(~0 << 3);
c >>= 4;
c <<= 1;
printf("%d", c);

Output:-. 30

The Output of the above code is different.

Not only this code (1st code) giving confusion but all types of code involving single line multiple bitwise shift operator gives unexpected results.

2nd code is doing correct.

Please run this code on your machine and verify above output

AND / OR

Explain why these output are not same.

OR

Finally we have to learn that we should not apply multiple bitwise shift operator in our code.

Thanks

  • 3
    The difference is in types. See what happens with `unsigned char c=((unsigned char) ((unsigned char) (~0 << 3) >> 4) << 1);` – Thomas Jager May 24 '19 at 14:13
  • 1
    there are default arithmetic conversions that are involved in coercion that make the difference. – alinsoar May 24 '19 at 14:14
  • The calculation for the first code is done using type `int` so bits other than the least significant 8 bits are preserved. With the second, the assignments drop all except the least significant 8 bits, thus changing the calculation completely. – Jonathan Leffler May 24 '19 at 14:20
  • @JohnKugelman Plenty of study material in SO's little [C FAQ](https://stackoverflow.com/tags/c/info) including [Implicit type promotion rules](https://stackoverflow.com/questions/46073295/implicit-type-promotion-rules), [What are bitwise shift (bit-shift) operators and how do they work?](https://stackoverflow.com/questions/141525/what-are-bitwise-shift-bit-shift-operators-and-how-do-they-work) and [Undefined, unspecified and implementation-defined behavior](https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior). – Lundin May 24 '19 at 14:45
  • Though in this case there's not many implicit conversions to write home about. There's a lvalue conversion to unsigned char. Otherwise, the type of the result of a shift expression is that of its (promoted) left operand. – Lundin May 24 '19 at 14:48

4 Answers4

10

~0 << 3 is always a bug, neither example is correct.

  • 0 is of type int which is signed.
  • ~0 will convert the binary contents to all ones: 0xFF...FF.
  • When you left shift data into the sign bit of a signed integer, you invoke undefined behavior. Same thing if you left shift a negative integer.

Conclusion: neither example has deterministic output and both can crash or print garbage.

Lundin
  • 195,001
  • 40
  • 254
  • 396
2

First, ~0 << 3 invokes undefined behavior because ~0 is a signed integer value with all bits set to 1 and you subsequently left shift into the sign bit.

Changing this to ~0u << 3 prevents UB but prints the same result, so the question is why.

So first we have this:

~0u

Which has type unsigned int. This is at least 16 bits so the value is:

0xffff

Then this:

`~0u << 3`

Gives you:

0xfff8

Then this:

((~0 << 3) >> 4)

Gives you:

0x0fff

And this:

(((~0 << 3) >> 4) << 1)

Gives you:

0x1ffe

Assigning this value to an unsigned char effectively trim it down to the low order byte:

0xfe

So it prints 254.

Now in the second case you start with this:

unsigned char c = (~0 << 3);

From above, this assigns 0xfff8 to c which gets truncated to 0xf8. Then >> 4 gives you 0x0f and << 1 gives you 0x1e which is 30.

dbush
  • 205,898
  • 23
  • 218
  • 273
  • 1
    The problem is, you can't reason about it like you do, because if there is a signed type involved, there's not just UB, but also implementation-defined behavior of right-shifting 0xFF...FF. It could be logic or arithmetic shift. But in your example, you force logical shift by changing the type to unsigned, so you have changed the meaning of the code. – Lundin May 24 '19 at 14:38
  • @Lundin maybe in general, but in this case it doesn’t change what gets shifted into the lowest order byte so the result is the same. – dbush May 24 '19 at 15:15
0

I compiled (with x86-64 gcc 9.1) these two lines:

int main() {
    unsigned char e=(~0 << 1);

    unsigned char d=(((~0 << 3) >> 4) << 1);
}

And I got the following assembly output:

main:
        push    rbp
        mov     rbp, rsp
        mov     BYTE PTR [rbp-1], -2
        mov     BYTE PTR [rbp-2], -2
        mov     eax, 0
        pop     rbp
        ret

As you can see, both lines are converted to the same instruction mov BYTE PTR [rbp-1], -2. So, it seems the compiler is making an optimization with your first code.

Alejandro Blasco
  • 1,295
  • 2
  • 20
  • 24
0

Thanks to Thomas Jager for his answer (given on question comment)

The solution is simple.

In 1st code, the bit manipulation is performed with taking operand as signed char. Because of this, two complement binary number continue to change its bit pattern as the bit manipulation is in process. After that the result two complement number is converted to positive number before assigning to unsigned variable c. Hence Result is 254 finally.

The question is to explain why two code output is different. We all know Code 2nd is working good. Hence i explain only why Code 1 is working incorrectly.

1st Code : -

unsigned char c=(((~0 << 3) >> 4) << 1);
printf("%d", c);

The tracing of 1st code are as follows : -

Step 1:   ~0 -----> -1 ----(binary form)----> 11111111 with sign bit 1 (means negative)


Step 2:   (sign bit 1)11111111 << 3 -----shifting to left----> (sign bit 1)11111000


Step 3 ***:   (sign bit 1)11111000 >> 4 ----shifing to right-----> (sign bit 1)11111111

*[*** - The left most bits is 1 in Result because of sign extension 
     Sign bit 1 retain its bit to 1 but right shifting the number will append 1 to 
left most bits without modify sign bit to 0 . 
     Hence all left most bit append to 1 because sign bit 1 is supplying this 1 to 
left most bits while right shifting                                          ]*


Step 4:    (sign bit 1)11111111 << 1 ---shifting to left---> (sign bit 1)11111110


Step 5:    two complement number (sign bit 1)11111110 converted to positive number 
            by deleting only sign bit to 0.


Step 6:    Result : (sign bit 0)11111110 ---decimal equivalent---> 254

I only just explain his answer.

Thanks to all for giving effort for answer of this question.