I'm trying to perform a left circular shift (rol) under AMD64.

What is the equivalent of the intrinsic that MSVC provides (_rotl64)?

Nocturnal
  • There isn't one in VS: only _rotl8 and _rotl16. You could use the routines in http://www.devx.com/tips/Tip/14043 or inline assembler – cup Mar 11 '14 at 12:45
  • I tried to come up or look up any inline assembly to perform the desired task, but failed miserably. – Nocturnal Mar 11 '14 at 13:07
  • You might find that if you write the C code to do it, the compiler will notice you are rolling, and use the roll instruction. (I'd like to say "should" but I'm not that confident in the compilers' abilities!) – M.M Mar 11 '14 at 13:19
  • @MattMcNabb that's what I originally aimed for, but didn't get anything. – Nocturnal Mar 11 '14 at 13:24

1 Answer

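#include <stdint.h>

/* Rotate x left by r bits. Both shift counts are masked to 0..63, which
   avoids the undefined x >> 64 that the unmasked form hits when r == 0. */
static inline uint64_t rotl64 ( uint64_t x, unsigned r )
{
  return (x << (r & 63)) | (x >> (-r & 63));
}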
Jun
  • This is not a single assembly instruction without clever optimization – vSzemkel Jan 15 '22 at 08:39
  • The peephole pass should get this one, it is so common. And both gcc and llvm, going back for literally centuries, easily optimize this at -O2: https://godbolt.org/z/n81fE4bKf (they both even get it at -O1, going back to the dawn of time. Even a punch-card version of gcc would get this.) – JasonN Sep 03 '22 at 23:29
  • Expecting compilers to apply backwards-transformation rules that turn code like this into ROL/ROR instructions slows down compilation, and it is a shameful demand by the developer community on compiler vendors. The compiler is not supposed to improve faulty software models; optimizing for the existing one is hard enough! (The scientific problem is the unnecessary code-graph template-detection logic, which runs over the entire graph TIMES the number of demanded templates, where an intelligent choice would be smarter.) – rplgn Jan 18 '23 at 10:44
  • @rplgn: I understand the argument, but this is exactly the job of the compilers. And hopefully, it's linear in complexity rather than combinatorial. If the compiler vendors struggle with this, they should get a solution standardized in C++/C. – Violet Giraffe Jan 29 '23 at 16:51
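
For anyone who wants to try the idiom discussed in this thread, here is a minimal sketch in C. The name rotl64_portable is made up for illustration and is not part of any library. Under MSVC the sketch forwards to _rotl64 (declared in <stdlib.h>); elsewhere it uses the shift-and-or form with both counts masked to 0..63, so a count of 0 or 64 never produces an out-of-range shift. Whether a given compiler turns the fallback into a single rol is up to its optimizer, so treat that as something to check in the generated assembly rather than a guarantee.

```c
#include <stdint.h>

#ifdef _MSC_VER
#include <stdlib.h>  /* MSVC declares _rotl64 here */
#endif

/* Hypothetical helper name, for illustration only.
   Rotates x left by r bits; r is reduced modulo 64. */
static inline uint64_t rotl64_portable(uint64_t x, unsigned r)
{
#ifdef _MSC_VER
    /* Use the compiler-provided intrinsic on MSVC. */
    return _rotl64(x, (int)(r & 63u));
#else
    /* (0u - r) & 63u equals (64 - r) mod 64, so both shift counts stay
       within 0..63 and the expression is well defined for every r. */
    return (x << (r & 63u)) | (x >> ((0u - r) & 63u));
#endif
}
```

As a quick check, rotl64_portable(0x8000000000000001, 1) gives 0x3: the top bit wraps around to bit 0 while the low bit moves up to bit 1.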