13

Is there an efficient way to find the log2 of a number assuming it's a power of 2. I know the obvious ways like having a table or

for (log2=0;x!=1;x>>=1,log2++);

But I am wondering if there is a more efficient/elegant way.

Paul R
  • 208,748
  • 37
  • 389
  • 560
Tohiko
  • 1,860
  • 2
  • 18
  • 26
  • 3
    Did you check math libraries? I bet there are some efficient functions – ssovukluk Nov 02 '17 at 11:27
  • 1
    What about log2(x) where x is the number for which you want to find the log of 2 ? – Bhawan Nov 02 '17 at 11:27
  • http://x86.renejeschke.de/html/file_module_x86_id_19.html – Sopel Nov 02 '17 at 11:32
  • 3
    but then you probably really just want `std::log2()` from ``. Then only worry about trying to find a more efficient one if you can demonstrate a bottleneck using a profiler. – underscore_d Nov 02 '17 at 11:38
  • 1
    Assuming `x` is a power of 2, you can immediately improve the worst-case running time using a binary search. Given the book-keeping overhead for right-shift / masking, this may not be much of an improvement. If the distribution for `x` is random, you should proceed from the most significant bit. – Brett Hale Nov 02 '17 at 12:07
  • 1
    Use a table; given 64 bit integers there are only 64 of of them. – Richard Critten Nov 02 '17 at 12:10
  • 2
    @RichardCritten - but how do you index that look-up table? – Brett Hale Nov 02 '17 at 12:11
  • 1
    The `for` loop looks reasonable enough that a decent optimizer might recognize it. Have you checked whether your compiler actually generates a loop? – MSalters Nov 02 '17 at 12:22
  • 1
    @BrettHale: You don't; binary search it, and then take the offset of the entry you found (!). – MSalters Nov 02 '17 at 12:24
  • 4
    another duplicate: [Fastest way of computing the power that a “power of 2” number used?](https://stackoverflow.com/q/21438631/995714) – phuclv Nov 02 '17 at 12:37
  • 1
    Actually, I'm interested in whether `std::log2()` might already use an optimised path in this case, though not interested enough to look it up personally right now. – underscore_d Nov 02 '17 at 13:05

2 Answers2

16

You can just count the number of leading or trailing zero bits, because any exact power-of-two is represented as a single 1 bit, with all other bits 0. Many CPUs have special instructions for doing this, and compilers such as gcc have intrinsics for these operations, which get compiled to a single instruction on appropriate architectures.

If you have an efficient clz ("count leading zeroes") then a log2 implementation might look like this:

int32_t ilog2(uint32_t x)
{
    return sizeof(uint32_t) * CHAR_BIT - clz(x) - 1;
}

(Note: returns -1 for ilog2(0).)

When using gcc or a gcc-compatible compiler you can just define clz like this:

#define clz(x) __builtin_clz(x)

Microsoft has something similar: BitScanReverse.

Note that it might appear more convenient to count trailing zeroes (using a ctz instruction), but a clz instruction is more widely available on different CPU architectures.

A further bonus of using clz rather than ctz is that you get floor(log2(x)) for non-power-of-2 values, making your ilog2 function more generally useful than if you had used ctz, which only works for exact powers-of-2.

See also: Finding the first set bit in a binary number.

Paul R
  • 208,748
  • 37
  • 389
  • 560
  • 1
    Isn't only ctz(num) sufficient to get the trailing zeros in the num ? – Bhawan Nov 02 '17 at 11:36
  • 1
    __builtin_ctz is a more straightforward option, surely... – davmac Nov 02 '17 at 11:36
  • 1
    Yes, not all CPUs have a `ctz` instruction though (e.g. ARM) - `clz` seems to be more common, in my experience. See: [Find first set](https://en.wikipedia.org/wiki/Find_first_set) Wikipedia entry for a good summary. – Paul R Nov 02 '17 at 11:36
  • 1
    Also see newly added note above re log2 for non-power-of-2 integer arguments. – Paul R Nov 02 '17 at 11:51
  • 1
    Since most people will be using Intel/AMD, the assembler mnemonics are ``bsr`` rsp. ``bsf``. By the way, I remember that they were fairly slow anyhow when they were introduced (probably implemented as a microassembler loop). – Arne Vogel Nov 02 '17 at 14:13
  • 2
    For ctz, I didn't check arm, but on aarch64 gcc generates rbit+clz, which shouldn't be much worse than 31-clz. – Marc Glisse Nov 03 '17 at 09:34
  • 1
    @MarcGlisse: fair enough - although you might still want to add handling for the 0 case (if needed), which you get for free when using `clz`. – Paul R Nov 03 '17 at 09:40
  • 2
    (0 is not a power of 2, but indeed people often need to handle power-of-2 or 0) gcc's doc says that __builtin_c[lt]z are both undefined with an argument of 0. – Marc Glisse Nov 03 '17 at 09:45
1

I haven't benchmarked this, but it ought to run reasonably fast since it doesn't require many iterations:

int which_power_of_2(uint64_t x) {
    uint64_t z = 0x0000000100000000ULL;
    int p = 31, d = 16;
    while (d) {
        if (x & (z-1)) {
            p -= d;
            z >>= d;
        }
        else {
            p += d;
            z <<= d;
        }
        d >>= 1;
    }
    return x ? ((x & z) ? p+1 : p) : -1;
}
r3mainer
  • 23,981
  • 3
  • 51
  • 88