10

I have a question as described: how to perform rotate shift in C without embedded assembly. To be more concrete, how to rotate shift a 32-bit int.

I'm now solving this problem with the help of type long long int, but I think it a little bit ugly and wanna know whether there is a more elegant method.

Kind regards.

Summer_More_More_Tea
  • 12,740
  • 12
  • 51
  • 83

3 Answers3

16

(warning to future readers): Wikipedia's code produces sub-optimal asm (gcc includes a branch or cmov). See Best practices for circular shift (rotate) operations in C++ for efficient UB-free rotates.


From Wikipedia:

unsigned int _rotl(unsigned int value, int shift) {
    if ((shift &= 31) == 0)
      return value;
    return (value << shift) | (value >> (32 - shift));
}

unsigned int _rotr(unsigned int value, int shift) {
    if ((shift &= 31) == 0)
      return value;
    return (value >> shift) | (value << (32 - shift));
}
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
Bertrand Marron
  • 21,501
  • 8
  • 58
  • 94
  • If the target has a `rotl` instruction, do you think that the final assembly would use it? If not (which I suspect), I suppose there is no other way than inline assembly? – Gauthier May 18 '10 at 11:35
  • I don't know whether the wikipedia is right, but when I go over the rotate operation, it seems that some flaws exists if simply perform (value >> shift ) | ( value << ( 32 - shift ) ). NOTICE: the right shift operation in C is signed-extended. – Summer_More_More_Tea May 21 '10 at 13:13
  • 2
    The right shift should only be sign-extended if the variable being shifted is signed. If your 'value' variable is signed, then you'll need to cast it to unsigned for the right shift. – Chris Waters May 23 '10 at 02:35
  • 3
    gcc -O3 (4.9.2) compiles both of these to use a rotate. It unfortunately doesn't eliminate conditional, though! readable version: https://goo.gl/ex5R7q. Unreadable version here: `_rotr` -> `movl %esi, %ecx / movl %edi, %eax / andl $31, %ecx / rorl %cl, %eax / testl %ecx, %ecx / cmove %edi, %eax / ret` (AMD64 ABI: first arg in `%rdi`, 2nd in `%rsi`, return in `%eax`). Even with `-O0`, gcc turns the pair of shifts into a rotate insn. – Peter Cordes Jul 12 '15 at 23:16
4

This answer is a duplicate of what I posted on Best-practices for compiler-friendly rotates.

See my answer on another question for the full details.

The most compiler-friendly way to express a rotate in C that avoids any Undefined Behaviour seems to be John Regehr's implementation:

uint32_t rotl32 (uint32_t x, unsigned int n)
{
  const unsigned int mask = (CHAR_BIT*sizeof(x)-1);

  assert ( (n<=mask) &&"rotate by type width or more");
  n &= mask;  // avoid undef behaviour with NDEBUG.  0 overhead for most types / compilers
  return (x<<n) | (x>>( (-n)&mask ));
}

Works for any integer type, not just uint32_t, so you could make multiple versions. This version inlines to a single rol %cl, reg (or rol $imm8, reg) on x86, because the compiler knows that the instruction already has the mask operation built-in.

I would recommend against templating this on the operand type, because you don't want to accidentally do a rotate of the wrong width, when you had a 16bit value stored in an int temporary. Especially since integer-promotion rules can turn the result of an expression involving a narrow unsigned type into and int.

Make sure you use unsigned types for x and the return value, or else it won't be a rotate. (gcc does arithmetic right shifts, shifting in copies of the sign-bit rather than zeroes, leading to a problem when you OR the two shifted values together.)

Community
  • 1
  • 1
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • @chux: I could relax the type for the rotate count... `gcc` generates better code on x86 when `n` is a 32 or 64b type, not 16 or 8, but just using `int` or `unsigned` could be ok there. Good catch on the logic error in my assert(), though. – Peter Cordes Aug 17 '15 at 18:24
  • 1
    @chux: oh, silly me. I was thinking you said `n` and `x`, not `n` and `mask`. I picked `uint32_t` for `n` after experimenting with 64, 32, 16, and 8 bit types for it, and finding that gcc emitted a redundant `AND` instruction in the fewest cases with `uint32_t n`. – Peter Cordes Aug 17 '15 at 18:28
  • @chux: agreed. `unsigned` is already 32bit on the compilers / platforms I looked at the asm output for. I updated my answer, thanks for your help perfecting it. – Peter Cordes Aug 17 '15 at 18:44
  • Time for clean-up. Now it is better than http://stackoverflow.com/a/776523/2410359 – chux - Reinstate Monica Aug 17 '15 at 18:44
  • @chux: yup, working on it :) I was thinking about changing variable names, to `n` = value being rotated, `c` = rotate count. But then it wouldn't match John Regehr's original, and some of the above comments would be nonsense. But many other questions use `n` as the value to rotate. – Peter Cordes Aug 17 '15 at 18:50
1

Though thread is old I wanted to add my two cents to the discussion and propose my solution of the problem. Hope it's worth a look but if I'm wrong correct me please.

When I was looking for efficient and safe way to rotate I was actually surprised that there is no real solution to that. I found few relevant threads here: https://blog.regehr.org/archives/1063 (Safe, Efficient, and Portable Rotate in C/C++), Best practices for circular shift (rotate) operations in C++ and wikipedia style (which involves branching but is safe):

uint32_t wikipedia_rotl(uint32_t value, int shift) {
    if ((shift &= 31) == 0)
      return value;
    return (value << shift) | (value >> (32 - shift));
}

After little bit of contemplation I discovered that modulo division would fit the criteria as the resulting reminder is always lower than divisor which perfectly fits the condition of shift<32 without branching. From mathematical point of view:

∀ x>=0, y: (x mod y) < y

In our case every (x % 32) < 32 which is exactly what we want to achieve. (And yes, I have checked that empirically and it always is <32)

#include <stdint.h>
uint32_t rotl32b_i1m (uint32_t x, uint32_t shift)
    {
    shift %= 32;
    return (x<<shift) | (x>>(32-shift));
    }

Additionally mod() will simplify the process because actual rotation of, let's say 100 bits is rotating full 32 bits 3 times, which essentially changes nothing and then 4 bits. So isn't it better to calculate 100%32==4 and rotate 4 bits? It takes single processor operation anyway and brings it to rotation of constant value plus one instruction, ok two as argument has to be taken from stack, but it's still better than branching with if() like in "wikipedia" way.

So, what you guys think of that?

tansy
  • 486
  • 3
  • 8
  • Welcome to Stack Overflow! This doesn't really answer the question. If you're looking for feedback, you might want to write up your samples in a more-complete fashion and ask for critique over at [codereview.se]. Be sure to read [A guide to Code Review for Stack Overflow users](//codereview.meta.stackexchange.com/a/5778) first, as some things are done differently over there! – Toby Speight Aug 02 '17 at 15:49
  • `shift %= 32` is exactly the same as John Regehr's `shift &= 31`. A base-2 modulo can be and is done by masking away the high bits. (Compilers know this, so they will compile your version the same, at least when optimization is enabled.) Except you forgot to apply it to `(32-shift)`, so your version has C undefined behaviour for `shift=0` (although on most compilers it will just compile to a rotate.) – Peter Cordes Aug 02 '17 at 19:23
  • Yes, actually you're right - I thought what will happen with shift=32 but didn't analyse if shift=0. In the same time it will cause the same undefined behaviour with `shift&=31' which formally in C/C++ disqualifies both versions and force to use "wikipedia" version (with checking condition & will produce branch). I checked with gcc that generates fairly safe code using roll: # gcc -g -O2 -S rot321m1.c _rotl32b_i1m: pushl %ebp movl %esp, %ebp movl 8(%ebp), %eax movl 12(%ebp), %ecx andl $31, %ecx roll %cl, %eax leave ret // same for John Regehr's shift&=31 version – tansy Aug 03 '17 at 23:13
  • error: ‘n’ was not declared in this scope. Did you mean n=shift? Or am I missing something? – AnnoyinC Jan 05 '21 at 07:58
  • @AnnoyinC: yes you're right, should be _shift_: `return (x<>(32-shift));`. – tansy Mar 11 '21 at 00:51