Why are the results different when I did bitwise right shift in C language?

Question

I wrote the following code to understand the bitwise right shift operation:

unsigned long a = 0;
unsigned long b = 0xFFFFFFFF;
a = ~a;                   // now a is equal to b
a = a >> 1; 
b = b >> 1; 
printf("a = %x\n", a);    // result: a = 0xFFFFFFFF
printf("b = %x\n", b);    // result: b = 0x7FFFFFFF

I thought that before the right shift a and b were both equal to 0xFFFFFFFF, so after the shift the results ought to be identical. But the results showed:

a = 0xFFFFFFFF
b = 0x7FFFFFFF

It seemed the MSB of a got 1, while the MSB of b got 0. Why did this difference happen?

Could be a compiler bug. Tried this with VS2019 and got the expected results. Could you share what toolchain you are using? - Also for the sake of experiment, could you swap out the shifting of a and b? — junix, Jan 28 '21 at 08:38
Don't mismatch the types: use `"%lx"` ... `printf("a = %lx\n", a);` ... same for `b`, of course. — pmg, Jan 28 '21 at 08:39
What is `pringf`? Please copy paste the actual code used instead of something you wrote up just now... I bet the real code has a bug not present in the fake code. — Lundin, Jan 28 '21 at 08:43
@junix not a compiler bug but PEBKAC. VS2019 suffers from a "feature" called LLP64. — Antti Haapala -- Слава Україні, Jan 28 '21 at 11:48
@AnttiHaapala Firstly: That's exactly why I said "could" and was asking for the toolchain in use. Secondly: Was it you closing the question with a link to a barely related question? — junix, Jan 29 '21 at 20:08
@junix Thanks for helping me edit this question. As some answers have pointed out, I just used the wrong specifier. `"%x"` doesn't show all the data of an `unsigned long`. It should be `"%lx"`. So it's not a compiler bug. Btw, I used Eclipse and MinGW to compile my code, since you asked. — Zhaoxi ZHOU, Jan 30 '21 at 04:22

klutt · Accepted Answer · 2021-01-28T14:10:18.827

If you fix the wrong specifier and add some extra printouts, you'll see that your assumption that a and b are equal before the shift is wrong:

#include <stdio.h>

int main(void) {
    unsigned long a = 0;
    unsigned long b = 0xFFFFFFFF;
    printf("a = %lx\n", a);
    printf("b = %lx\n", b);
    a = ~a;                   // assumed: now a is equal to b

    printf("a == b? %s\n", a == b ? "Yes" : "No");

    printf("a = %lx\n", a);
    printf("b = %lx\n", b);
    a = a >> 1;
    b = b >> 1;
    printf("a = %lx\n", a);    // 7fffffffffffffff when long is 64 bits
    printf("b = %lx\n", b);    // 7fffffff
}

Output:

$ ./a.out 
a = 0
b = ffffffff
a == b? No
a = ffffffffffffffff
b = ffffffff
a = 7fffffffffffffff
b = 7fffffff
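
The output above comes from a platform where `unsigned long` is 64 bits wide: `~a` sets every bit of the type, while `0xFFFFFFFF` sets only the low 32. A minimal sketch of that check follows; the helper names `ulong_bits` and `complement_equals_32bit_mask` are hypothetical and not part of the original answer.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in unsigned long on this platform
   (assumes the type has no padding bits). */
static size_t ulong_bits(void) {
    return sizeof(unsigned long) * CHAR_BIT;
}

/* Returns 1 when ~0UL equals 0xFFFFFFFF, which is only the case
   when unsigned long is exactly 32 bits wide. */
static int complement_equals_32bit_mask(void) {
    unsigned long a = 0;
    unsigned long b = 0xFFFFFFFF;   /* only the low 32 bits are set */
    a = ~a;                         /* every bit of the type is set */
    return a == b;
}
```

Calling both helpers from `main` and printing the results should show 64 and 0 on a typical 64-bit Linux build, and 32 and 1 where `long` is 32 bits (for example under LLP64).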

Remember to compile with -Wall -Wextra to catch bugs like this.

Solution: change the initialization to unsigned long b = -1; so that b gets every bit set, whatever the width of long. Another solution: include limits.h and use ULONG_MAX.
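
A small sketch of why either fix works regardless of the width of `unsigned long`. The helper names `all_ones_ulong` and `shift_right_once` are made up for illustration.

```c
#include <limits.h>

/* Converting -1 to an unsigned type yields that type's maximum value
   (C 2018 6.3.1.3), so this returns ULONG_MAX on any conforming platform. */
static unsigned long all_ones_ulong(void) {
    unsigned long b = -1;
    return b;
}

/* A right shift of an unsigned value by one fills the top bit with 0,
   whatever the width of the type. */
static unsigned long shift_right_once(unsigned long v) {
    return v >> 1;
}
```

With `b` initialized this way, `shift_right_once(~0UL)` and `shift_right_once(all_ones_ulong())` give the same value, which is what the question expected to see.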

It's worth noting that long does not have a fixed width. If you want fixed width, then use types like uint32_t.
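
As a rough illustration of the fixed-width route, the sketch below shifts a `uint32_t` and formats it with the `PRIx32` macro from `<inttypes.h>`, which expands to the matching `printf` conversion. The function name `format_shifted_u32` is an assumption made for this example.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Shifts a 32-bit value right by one and writes it as lowercase hex
   into buf. Returns whatever snprintf returns (the number of characters
   the full result needs, excluding the terminating null). */
static int format_shifted_u32(char *buf, size_t size, uint32_t value) {
    uint32_t shifted = value >> 1;
    return snprintf(buf, size, "%" PRIx32, shifted);
}
```

Passing `UINT32_MAX` with a 16-byte buffer should leave `7fffffff` in the buffer on every platform, because the width of the value no longer depends on `long`.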

edited Jan 28 '21 at 14:10

answered Jan 28 '21 at 08:44

klutt


It's worth adding, that this highly depends on the toolchain in use. IIRC `unsigned long` is not of a standardized word width and very often bound to 32 bit – junix Jan 28 '21 at 08:50
Instead of using `-1`, include `<limits.h>` and use `ULONG_MAX` which is guaranteed to be the correct maximum value for `unsigned long`. – koder Jan 28 '21 at 08:58
For (optional) "standardized word width" you can use `uint32_t` and `printf("%" PRIx32 "\n", a);` ... see https://ideone.com/WFsUPh – pmg Jan 28 '21 at 09:00
@koder If I'm not mistaken, also the `-1` is guaranteed to yield that result? – klutt Jan 28 '21 at 09:31
@klutt, `-1` happens to match an all 1 bit pattern in all implementations I know of. But as far as I know, the C standard makes no such guarantee. – koder Jan 28 '21 at 09:45
@koder Seems like it does https://stackoverflow.com/q/65934534/6699433 – klutt Jan 28 '21 at 09:52
@koder Besides, the bit pattern is not relevant here. Look at Lundin's answer: https://stackoverflow.com/a/65936573/6699433 – klutt Jan 28 '21 at 11:57
@koder: The conversion of −1 to an unsigned integer type is defined in C 2018 6.3.1.3 2 based solely on value (it has nothing to do with how numbers are represented in bits) and is such that the conversion of −1 to an unsigned type necessarily produces “the maximum value that can be represented in the new type.” – Eric Postpischil Jan 28 '21 at 13:36
So yes, -1 is guaranteed to convert nicely to the maximum `unsigned long` value but it requires a lot of reading the standard to prove it. That same standard provides `ULONG_MAX` as the maximum value representable by `unsigned long`. So why not use that? – koder Jan 28 '21 at 14:06
@koder One reason is to avoid code duplication. Suppose you have `unsigned long x = ULONG_MAX;` and later you realize that it should be `unsigned long long` instead. Will you remember to ALSO change to `ULLONG_MAX`? – klutt Jan 28 '21 at 14:09
I'd agree that avoiding "magic numbers" in favour of named constants is good practice, regardless of what guarantees the C standard makes. Regarding the code repetition argument, I'd say that the correct solution is to use `uint32_t` or `uint64_t` instead. It's also harder to make an accidental typo mistake between `UINT32_MAX` -> `UINT64_MAX` than `ULONG...` -> `ULLONG`. – Lundin Jan 28 '21 at 14:15
@klutt Thanks. I made a mistake about the length of `unsigned long`. – Zhaoxi ZHOU Jan 29 '21 at 06:37
