Possible Duplicates:
Previous power of 2
Getting the Leftmost Bit
What I want is, suppose there is a number 5
i.e. 101
. My answer should be 100
. For 9
i.e. 1001
, the answer should be 1000
Possible Duplicates:
Previous power of 2
Getting the Leftmost Bit
What I want is, suppose there is a number 5
i.e. 101
. My answer should be 100
. For 9
i.e. 1001
, the answer should be 1000
You can't ask for the fastest sequence without giving constrains on the machine on which this has to run. For example, some machines support an instruction called "count leading zeroes" or have means to emulate it very quickly. If you can access this instruction (for example with gcc) then you can write:
#include <limits.h>
#include <stdint.h>
uint32_t f(uint32_t x)
{
return ((uint64_t)1)<<(32-__builtin_clz(x)-1);
}
int main()
{
printf("=>%d\n",f(5));
printf("=>%d\n",f(9));
}
f(x) returns what you want (the least y with x>=y and y=2**n). The compiler will now generate the optimal code sequence for the target machine. For example, when compiling for a default x86_64 architecture, f() looks like this:
bsrl %edi, %edi
movl $31, %ecx
movl $1, %eax
xorl $31, %edi
subl %edi, %ecx
salq %cl, %rax
ret
You see, no loops here! 7 instructions, no branches.
But if I tell my compiler (gcc-4.5) to optimize for the machine I'm using right now (AMD Phenom-II), then this comes out for f():
bsrl %edi, %ecx
movl $1, %eax
salq %cl, %rax
ret
This is probably the fastest way to go for this machine.
EDIT: f(0) would have resulted in UB, I've fixed that (and the assembly). Also, uint32_t means that I can write 32 without feeling guilty :-)
From Hacker's Delight, a nice branchless solution:
uint32_t flp2 (uint32_t x)
{
x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >> 16);
return x - (x >> 1);
}
This typically takes 12 instructions. You can do it in fewer if your CPU has a "count leading zeroes" instruction.
int input = 5;
std::size_t numBits = 0;
while(input)
{
input >>= 1;
numBits++;
}
int output = 1 << (numBits-1);
This is a task related to the bit counting. Check this out.
Using the 2a (which is my favorite of the algorithms; not the fastest) one can come up with this:
int highest_bit_mask (unsigned int n) {
while (n) {
if (n & (n-1)) {
n &= (n-1) ;
} else {
return n;
}
}
return 0;
}
The magic of n &= (n-1);
is that it removes from n
the least significant bit. (Corollary: n & (n-1)
is false only when n
has precisely one bit set.) The algorithm complexity depends on number of bits set in the input.
Check out the link anyway. It is a very amusing and enlightening read which might give you more ideas.