How does the following code work and what do the variables mean:
y = (x << shift) | (x >> (sizeof(x)*CHAR_BIT - shift));
I found it in a circular shift article, but with no explanation of how it works.
This is a method of doing a circular shift. Suppose that x is 8 bits.
+----+----+----+----+----+----+----+----+
| x1   x2   x3   x4   x5   x6   x7   x8 |
+----+----+----+----+----+----+----+----+
Then, shifting it left by 3 gives us:
+----+----+----+----+----+----+----+----+
| x4   x5   x6   x7   x8   0    0    0  |
+----+----+----+----+----+----+----+----+
Now, CHAR_BIT*sizeof(x) is the same as the width of x in bits, 8. So shifting x to the right by 8 - 3 gives us:
+----+----+----+----+----+----+----+----+
| 0    0    0    0    0    x1   x2   x3 |
+----+----+----+----+----+----+----+----+
And taking the OR you get:
+----+----+----+----+----+----+----+----+
| x4   x5   x6   x7   x8   x1   x2   x3 |
+----+----+----+----+----+----+----+----+
It is called a circular shift or "rotation" because the bits that get shifted out on the left get shifted back in on the right.
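To make the diagrams concrete, here is a minimal sketch of the same three steps for an 8-bit value. The function name rotl8 and the precondition 0 < shift < 8 are my own assumptions for illustration; they are not from the article.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate an 8-bit value left by `shift` bits.
 * Hypothetical helper; assumes 0 < shift < 8. */
uint8_t rotl8(uint8_t x, unsigned shift)
{
    assert(shift > 0 && shift < 8);

    /* x is promoted to int before shifting, so cast back to 8 bits
     * to drop the bits that moved past the left edge. */
    uint8_t high = (uint8_t)(x << shift);       /* x4..x8 followed by zeros */
    uint8_t low  = (uint8_t)(x >> (8 - shift)); /* zeros followed by x1..x3 */

    return (uint8_t)(high | low);               /* OR merges the two halves */
}
```

For example, rotl8(0xE0, 2) takes 1110 0000 to 1000 0011, i.e. 0x83.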
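This code invokes undefined behavior if either shift count is equal to or larger than the width of the promoted type. In particular, when shift is 0 the right shift count equals the full width of x. Fortunately, there's an easy fix: reduce the right shift count modulo the width (shift itself must still be smaller than the width):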
y = (x << shift) |
(x >> ((sizeof(x) * CHAR_BIT - shift) %
(sizeof(x) * CHAR_BIT)));
Sophisticated compilers will actually compile the code down to a hardware rotation instruction. For example,
#include <limits.h>
unsigned rotate(unsigned x, unsigned shift) {
return (x << shift) |
(x >> ((sizeof(x) * CHAR_BIT - shift) %
(sizeof(x) * CHAR_BIT)));
}
On Godbolt, you can try this out and see the generated code. At -O with GCC 12 on x86, the result is a rol instruction:
rotate:
mov eax, edi
mov ecx, esi
rol eax, cl
ret
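As an aside, a different but widely used way to write the same thing is to mask both shift counts instead of using %. This is a sketch of that idiom, not the code from the answer above; the name rotl32 is an assumption, and the trick relies on the width being a power of two.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The masking trick below only works when the width is a power of two. */
static_assert(((CHAR_BIT * sizeof(uint32_t)) &
               (CHAR_BIT * sizeof(uint32_t) - 1)) == 0,
              "width must be a power of two");

/* Rotate left by any count; counts >= 32 wrap around (hypothetical helper). */
uint32_t rotl32(uint32_t x, unsigned shift)
{
    const unsigned mask = CHAR_BIT * sizeof(x) - 1; /* 31 for 32 bits */

    shift &= mask;                        /* left count is now in 0..31 */
    /* -shift is computed in unsigned arithmetic, so (-shift & mask) is
     * (32 - shift) for shift in 1..31 and 0 for shift == 0. */
    return (x << shift) | (x >> (-shift & mask));
}
```

With this form neither shift count can reach the width, so shift values of 0, 32 or more are all well-defined.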
CHAR_BIT is the number of bits per byte. It is 8 on virtually every modern platform, although the C standard only guarantees that it is at least 8.
shift is the number of bits you want to shift left in a circular fashion, so the bits that get shifted out on the left come back in on the right.
Rotating 1110 0000 left by 2 results in:
1000 0011
Code for the example:
y = (x << 2) | (x >> (8 - 2));
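If you would rather not hardcode the 8, the width can be computed from the type. The sketch below is only an illustration: BIT_WIDTH and rotl_uchar are names I made up, and the macro assumes the type has no padding bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Width in bits of an object or type, assuming it has no padding bits.
 * Hypothetical macro, not part of the standard library. */
#define BIT_WIDTH(x) (sizeof(x) * CHAR_BIT)

static_assert(CHAR_BIT >= 8, "the C standard guarantees at least 8 bits");

/* Same expression as the example above, with the width computed. */
unsigned char rotl_uchar(unsigned char x, unsigned shift)
{
    shift %= BIT_WIDTH(x);      /* keep the count below the width */
    if (shift == 0)
        return x;               /* nothing to rotate */
    return (unsigned char)((x << shift) | (x >> (BIT_WIDTH(x) - shift)));
}
```

On a platform where CHAR_BIT is 8, rotl_uchar(0xE0, 2) gives 0x83, matching the example.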
(x << shift)
Shifts x left by shift bits. The bits shifted out on the left are discarded and zeros fill in on the right.
(x >> (sizeof(x)*CHAR_BIT - shift))
Shifts x right so that the top shift bits land in the low positions that the left shift vacated.
CHAR_BIT is the number of bits in a char, which is 8 on almost all platforms.
In C, the smallest unit you can address is not a single bit but a char, so object sizes are always a whole number of chars. That is why the width in bits is computed as sizeof(x)*CHAR_BIT.
In general:
For a char, a bit rotate operates on an 8-bit field (1 byte).
For an int, a rotate typically operates on a 32-bit field (4 bytes), although the exact size is platform-dependent.
Example with 8 bits:
x = 11010101
shift = 2
x << shift = 01010100 // shifted left by 2 bits
x >> (sizeof(x) * CHAR_BIT - shift)
= x >> ((1 * CHAR_BIT) - shift)
= x >> 6
= 00000011 // shifted right by 6 bits
OR these bitwise to give
01010100 //x << 2
00000011 //x >> 6
________
01010111
That is x circularly shifted left by 2 bits.
This works with unsigned types only. For a negative signed value, the right-shift operator (>>) typically performs an arithmetic shift, filling the leftmost bits with copies of the most significant bit (1s) instead of zeros. Strictly speaking, the result is implementation-defined.
I'd write it like this:
y = (x << shift) | ((x >> (sizeof(x)*CHAR_BIT - shift)) & (0xFFFF >> (sizeof(x)*CHAR_BIT - shift)));
Here, the mask applied before the | operator guarantees that the upper n bits (n = sizeof(x)*CHAR_BIT - shift) of the right-shifted value are zero, leaving only the low shift bits. The mask 0xFFFF assumes that x is a short (2 bytes long), so the expression is type-dependent.
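An alternative to masking, sketched here under my own assumptions, is to do the rotation in an unsigned type, so the right shift always fills with zeros. The name rotl16 and the precondition shift < 16 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit pattern left by `shift` bits; assumes shift < 16.
 * Hypothetical helper: pass a signed value as (uint16_t)value. */
uint16_t rotl16(uint16_t x, unsigned shift)
{
    assert(shift < 16);

    unsigned ux = x;    /* unsigned, so >> never copies the sign bit */
    if (shift == 0)
        return x;       /* avoids shifting by the full 16 bits */

    /* The cast drops any bits that moved above bit 15. */
    return (uint16_t)((ux << shift) | (ux >> (16 - shift)));
}
```

Converting a negative short to uint16_t is well-defined in C (it wraps modulo 65536), so rotl16((uint16_t)s, n) rotates the bit pattern of s without any sign extension.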