2

I'm trying to implement a rotateRight by n function in C by only using bitwise operators.

So far, I have settled on using this.

y = x >> n
z = x << (32 - n)

g = y | z

So take for instance the value 11010011

If I were to try and `rotateRight(5):

y becomes 11111110

z becomes 01100000

Then g becomes 111111110

However the correct answer should be 10011110

This almost works, but the problem is that the right-shift copies the sign bit when I need it to perform logical shift, and so some of my answers are the negative of what they should be. How can I fix this?

Note I am unable to use casting or unsigned types

Andrew Brown
  • 167
  • 1
  • 7
krb686
  • 1,726
  • 9
  • 26
  • 46
  • 2
    Why you can´t use unsigned? "Teacher" would be the only acceptable reason to me... – deviantfan Feb 20 '14 at 00:33
  • 1
    Bear in mind that `>>` is implementation defined for negative `signed int` - i.e. you can't assume whether it's an arithmetic or logical shift unless you can guarantee the code's never going to see a different compiler or platform. – Notlikethat Feb 20 '14 at 01:41
  • Is it important that a shift of 0 works? – chux - Reinstate Monica Feb 20 '14 at 02:22
  • 1) A right shift that copies the sign bit is not a logical "bitwise operator" but an "arithmetic operator" as its function depends on "sign" 2) When an `int` is 32 bits, a left or right shift of 32 is not defined. 3) For portability, an `int` bit size is `sizeof(int)*CHAR_BIT`. – chux - Reinstate Monica Feb 20 '14 at 02:32
  • For the record: best-practices for expressing rotates in a compiler-friendly way, avoiding undefined behaviour: http://stackoverflow.com/questions/776508/circular-shift-rotate-operations-in-c. – Peter Cordes Aug 17 '15 at 17:36

3 Answers3

6

You could shift unsigned values:

y = (int)((unsigned)x >> n);
z = x << (32 - n);
g = y | z;

Or, you could mask appropriately:

y = (x >> n) & ~(-1 << (32 - n));
z = x << (32 - n);
g = y | z;
jlahd
  • 6,257
  • 1
  • 15
  • 21
  • Thanks, but apologies I should've included in the original post, I cannot cast or use unsigned types – krb686 Feb 20 '14 at 00:32
  • 2
    Your answer is correct and exactly what I wanted to say but I would ask you to please elaborate since it seems that he is a beginner. I would assume that he does not understand the difference between *shift right* and *shift arithmetic right* and would be inclined to provide links to articles about them and include an explanation in the answer as to how `int` and `unsigned int` play a part there. – nonsensickle Feb 20 '14 at 00:57
  • Alternatively, with your permission, I will gladly add this to your answer in an edit. – nonsensickle Feb 20 '14 at 00:58
  • A good point; if you already have good links to such articles, please go ahead and make the edit. Thanks! – jlahd Feb 20 '14 at 01:01
  • @nonsensickle Thanks. I think I understand the difference between the two types of shifts. Logical right copies 0s, and arithmetic right copies the sign bit and retains the same value for 2's complement, right? Complex answers with shifts and bitmasks on the other hand are quite difficult for me to come up with in my head. – krb686 Feb 20 '14 at 01:05
  • No worries, glad I could be of help :) – nonsensickle Feb 20 '14 at 01:23
  • 1
    It might be worth mentioning there is a rotate instruction in x86, and that I've looked at the assembly MSVC generates when provided code like this and it does resolve to one rotate instruction. – Apriori Feb 20 '14 at 17:37
  • `x << (32-n)` will cause undefined behaviour (signed arithmetic overflow), you also need to cast `x` to `unsigned` on that line. – M.M Jun 12 '15 at 03:40
3

Though @jlahd answer is correct I will try and provide a brief explanation of the difference between a logical shift right and an arithmetic shift right (another nice diagram of the difference can be found here).

Please read the links first and then if you're still confused read below:

Brief explanation of the two different shifts right

Now, if you declare your variable as int x = 8; the C compiler knows that this number is signed and when you use a shift operator like this:

int x = 8;
int y = -8;
int shifted_x, shifted_y;

shifted_x = x >> 2; // After this operation shifted_x == 2
shifted_y = y >> 2; // After this operation shifted_y == -2

The reason for this is that a shift right represents a division by a power of 2.

Now, I'm lazy so lets make int's on my hypothetical machine 8 bits so I can save myself some writing. In binary 8 and -8 would look like this:

 8 = 00001000
-8 = 11111000 ( invert and add 1 for complement 2 representation )

But in computing the binary number 11111000 is 248 in decimal. It can only represent -8 if we remember that that variable has a sign...

If we want to keep the nice property of a shift where the shift represents a division by a power of 2 (this is really useful) and we want to now have signed numbers, we need to make two different types of right shifts because

 248 >> 1 = 124 = 01111100
 -8  >> 1 = -4  = 11111100
// And for comparison
  8  >> 1 =  4  = 00000100

We can see that the first shift inserted a 0 at the front while the second shift inserted a 1. This is because of the difference between the signed numbers and unsigned numbers, in two's complement representation, when dividing by a power of 2.

To keep this nicety we have two different right shift operators for signed and unsigned variables. In assembly you can explicitly state which you wish to use while in C the compiler decides for you based on the declared type.

Code generalisation

I would write the code a little differently in an attempt to keep myself at least a little platform agnostic.

#define ROTR(x,n) (((x) >> (n)) | ((x) << ((sizeof(x) * 8) - (n))))
#define ROTR(x,n) (((x) >> (n)) | ((x) << ((sizeof(x) * 8) - (n))))

This is a little better but you still have to remember to keep the variables unsigned when using this macro. I could try casting the macro like this:

#define ROTR(x,n) (((size_t)(x) >> (n)) | ((size_t)(x) << ((sizeof(x) * 8) - (n))))
#define ROTR(x,n) (((size_t)(x) >> (n)) | ((size_t)(x) << ((sizeof(x) * 8) - (n))))

but now I'm assuming that you're never going to try and rotate an integer larger than size_t...

In order to get rid of the upper bits of the right shift which may be 1's or 0's depending on the type of shift the compiler chooses one might try the following (which satisfies your no casting requirement):

#define ROTR(x,n) ((((x) >> (n)) & (~(0u) >> (n))) | ((x) << ((sizeof(x) * 8) - (n))))
#define ROTR(x,n) ((((x) >> (n)) & (~(0u) >> (n))) | ((x) << ((sizeof(x) * 8) - (n))))

But it would not work as expected for the long type since the ~(0u) is of type unsigned int (first type which zero fits in the table) and hence restricts us to rotations that are less than sizeof(unsigned int) * 8 bits. In which case we could use ~(0ul) but that makes it of unsigned long type and this type may be inefficient on your platform and what do we do if you want to pass in a long long? We would need it to be of the same type as x and we could achieve it by doing more magical expressions like ~((x)^(x)), but we would still need to turn it into and unsigned version so lets not go there.

@MattMcNabb also points out in the comments two more problems:

  1. our left shift operation could overflow. When operating on signed types, even though in practice it is most often the same, we need to cast the x in the left shift operation to an unsigned type, because it is undefined behavior when an arithmetic shift operation overflows (see this answer's reference to the standard). But if we cast it we will once again need to pick a suitable type for the cast because its size in bytes will act as an upper limit on what we can rotate...

  2. We are assuming that bytes have 8 bits. Which is not always the case, and we should use CHAR_BIT instead of 8.

In which case why bother? Why not go back to the previous solution and just use the largest integer type, uintmax_t (C99), instead of size_t. But this now means that we could be penalized in performance since we might be using integers larger than the processor word and that could involve more than just one assembly instruction per arithmetic operation... Nevertheless here it is:

#define ROTR(x,n) (((uintmax_t)(x) >> (n)) | ((uintmax_t)(x) << ((sizeof(x) * CHAR_BIT) - (n))))
#define ROTR(x,n) (((uintmax_t)(x) >> (n)) | ((uintmax_t)(x) << ((sizeof(x) * CHAR_BIT) - (n))))

So really, there is likely no perfect way to do it (at least none that I can think of). You can either have it work for all types or have it be fast by only dealing with things equal to or smaller than the processor word (eliminate long long and the likes). But this is nice and generic and should adhere to the standard...

If you want fast algorithms there is a point where you need to know what machine/s you're writing code for otherwise you can't optimize.

So in the end @jlahd's solution will work better, whilst my one might help you make things more generic (at a cost).

Community
  • 1
  • 1
nonsensickle
  • 4,438
  • 2
  • 34
  • 61
  • This seems the cleanest of the bunch, however I would suggest putting parens around the 'n' in each case; as this squelches some compiler warnings. – Albinofrenchy Jun 11 '15 at 18:51
  • Yeah, you're right. Might as well be consistent. Thanks :) – nonsensickle Jun 12 '15 at 02:47
  • All of the `8` should be `CHAR_BIT` . I'm assuming you're going for portability here! – M.M Jun 12 '15 at 03:40
  • The `x` in the left shift needs to be made unsigned, otherwise you cause undefined behaviour by signed arithmetic overflow. – M.M Jun 12 '15 at 03:41
  • @MattMcNabb You're right about the left shift but I don't like the `CHAR_BIT`. Though according to [this answer](http://stackoverflow.com/a/2215454/1667513) it is equivalent to a byte in C99, it still implies that this code has something to do with a `char`, in plain English, which it doesn't. It deals with bytes. – nonsensickle Jun 12 '15 at 09:41
  • @nonsensickle in C, `char` means one byte. `CHAR_BIT` is the number of bits in a byte. – M.M Jun 12 '15 at 10:56
  • @MattMcNabb Ok, I've added in `CHAR_BIT` (reluctantly). Though I don't like its style I can't argue its correctness. Similarly, I've added a reference to `uintmax_t` and explained my reasoning a little more. Hopefully it is more compliant now. I appreciate any/all feedback. – nonsensickle Jun 12 '15 at 11:15
0

I've tried your code on x86 Linux with gcc 4.6.3.

y = x >> n
z = x << (32 - n)

g = y | z

This works correct.If x equals 11010011 then rotateRight(5) will makes y become 00000110.">>" will not add 1.

Ezio
  • 1,116
  • 4
  • 13
  • 25