2

I would like to perform the following operation as quickly as possible

x / LSB(x)  

where x is an integral value unknown at compile time and LSB(x) = x & -x. (Alternatively, the operation is equivalent to an even division by the highest power of 2 <= x.) I am looking for a reasonably portable solution (without compiler intrinsics/builtins like GCC's __builtin_clz or alike).

My concern is that the following simple implementation

x / (x & -x)

would still result in an expensive division as compiler might fail to realize that the division is in fact equivalent to right-shift by the number of trailing zeroes in the divisor.

If my concerns are reasonable, what would be a more efficient way to implement it?

I would appreciate a solution that is easily extendible to integral types of sizes 32-bit, 64-bits, 128-bits, ...

eold
  • 5,972
  • 11
  • 56
  • 75
  • 1
    As far as I know you can't really do that yet, well except with a bunch of preprocessor stuff to select the right intrinsic. [N3864](http://open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3864.html) would help.. – harold Mar 26 '14 at 17:58
  • Are you coding in C or C++? Also, compilers are usually pretty good at spotting this stuff. – Puppy Mar 26 '14 at 19:00
  • C++. I mostly care about GCC, clang and MSVC++. – eold Mar 26 '14 at 19:19
  • What do you mean by even division? I don't see how a division by the highest power of 2 <= x can be equivalent to x / (x & -x). – Falk Hüffner Mar 27 '14 at 10:11
  • Even division = division without remainder. – eold Mar 27 '14 at 15:36
  • @FalkHüffner: Who said "highest power of 2 <= x"? This is the least significant set bit, not the most significant. Division by the MSB would always yield `1`. Oh, now I see that in the question. Yes, the "alternate" explanation is clearly wrong. – Ben Voigt Mar 27 '14 at 20:13
  • If you divide by non-MSB (e.g. highest bit) you *do* have remainder, and thus it is not considered even division. I see no problem with the alternate explanation. – eold Mar 28 '14 at 18:50

2 Answers2

1

How about

x >>= ffs(x)-1;

The ffs function conforms to 4.3BSD, POSIX.1-2001.

It won't work if x is 0.

pat
  • 12,587
  • 1
  • 23
  • 52
0

If you don't want to rely on CLZ (count leading zeros) hardware instructions, you can count leading zeros as described in this answer. It's very fast with a look-up and multiplication by a magic number. I'll re-post the code here:

unsigned x;  // input to clz
unsigned c;  // output of clz
static const unsigned MultiplyDeBruijnBitPosition[32] = 
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
c = MultiplyDeBruijnBitPosition[((unsigned)((x & -x) * 0x077CB531U)) >> 27];

Once you have counted the leading zeros, you no loner need to use a division instruction. Instead, you can just shift the value right by c. That is (eliminating an unneeded temporary value), the code becomes this:

static const unsigned MultiplyDeBruijnBitPosition[32] = 
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

x >>= MultiplyDeBruijnBitPosition[((unsigned)((x & -x) * 0x077CB531U)) >> 27]; // x /= LSB(x)  
Community
  • 1
  • 1
Apriori
  • 2,308
  • 15
  • 16