4

As in whether it falls within 2^3 - 2^4, 2^4 - 2^5, etc. The number returned would be the EXPONENT itself (minus an offset).

How could this be done extremely quickly and efficiently as possible? This function will be called a lot in a program that is EXTREMELY dependent on speed. This is my current code but it is far too inefficient as it uses a for loop.

static inline size_t getIndex(size_t numOfBytes)
{
    int i = 3;
    for (; i < 32; i++) 
    {
        if (numOfBytes < (1 << i)) 
            return i - OFFSET;
    }
    return (NUM_OF_BUCKETS - 1);
}

Thank you very much!

Jason Block
  • 155
  • 1
  • 3
  • 9

7 Answers7

9

What you're after is simply log2(n), as far as I can tell.

It might be worth cheating and using some inline assembly if your target architecture(s) have instructions that can do this. See the Wikipedia entry on "find first set" for lots of discussion and information about hardware support.

unwind
  • 391,730
  • 64
  • 469
  • 606
  • 3
    Also compilers might have support that implements this directly. E.g gcc has `__builtin_clz` (cleared leading zero) that counts the numbers of leading zeros. You'd have to test a bit which of the `clz`, `clzl` or `clzll` corresponds to your `size_t` type, but then it should just issue one assembler instruction for this. – Jens Gustedt Aug 14 '12 at 18:41
5

One way to do it would be to find the highest order bit that is set to 1. I'm trying to think if this is efficient though, since you'll still have to do n checks in worst case.

Maybe you could do a binary search style where you check if it's greater than 2^16, if so, check if it's greater than 2^24 (assuming 32 bits here), and if not, then check if it's greater than 2^20, etc... That would be log(n) checks, but I'm not sure of the efficiency of a bit check vs a full int comparison.

Could get some perf data on either.

Andrew Rasmussen
  • 14,912
  • 10
  • 45
  • 81
  • Is there a way to do that without right-shifting until you win? – John Aug 14 '12 at 18:12
  • @arasmussen: Yes, use any of the algorithms to test whether a number is a power of 2 and you can use it with slight modifications. Count the number of needed bits might be easiest like you say. – Andreas Reiff Aug 14 '12 at 18:12
  • Ahh ... binary sort. Forgot about that. – John Aug 14 '12 at 18:13
  • 2
    @John, the fastest way is usually by using compiler builtins which may take advantage of native CPU instructions. – eq- Aug 14 '12 at 18:18
1

There is a particularly efficient algorithm using de Bruijn sequences described on Sean Eron Anderson's excellent Bit Twiddling Hacks page:

uint32_t v; // find the log base 2 of 32-bit v
int r;      // result goes here

static const int MultiplyDeBruijnBitPosition[32] = 
{
  0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
  8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};

v |= v >> 1; // first round down to one less than a power of 2 
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;

r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> 27];

It works in 13 operations without branching!

smocking
  • 3,689
  • 18
  • 22
1

You are basically trying to compute: floor(log2(x))

Take the logarithm to the base 2, then take the floor.

The most portable way to do this in C is to use the logf() function, which finds the log to the base e, then adjust: log2(x) == logf(x) / logf(2.0)

See the answer here: How to write log base(2) in c/c++

If you just cast the resulting float value to int, you compute floor() at the same time.

But, if it is available to you and you can use it, there is an extremely fast way to compute log2() of a floating point number: logbf()

From the man page:

   The inte-
   ger constant FLT_RADIX, defined in <float.h>, indicates the radix  used
   for  the  system's  floating-point  representation.  If FLT_RADIX is 2,
   logb(x) is equal to floor(log2(x)), except that it is probably faster.

http://linux.die.net/man/3/logb

If you think about how floating-point numbers are stored, you realize that the value floor(log2(x)) is part of the number, and if you just extract that value you are done. A little bit of shifting and bit-masking, and subtract the bias from the exponent (or technically the "significand") and there you have it. The fastest way possible to compute floor(log2(x)) for any float value x.

http://en.wikipedia.org/wiki/Single_precision

But actually logbf() converts the result to a float before giving it to you, and handles errors. If you write your own function to extract the exponent as an integer, it will be slightly faster and an integer is what you want anyway. If you wanted to write your own function you need to use a C union to gain access to the bits inside the float; trying to play with pointers will get you warnings or errors related to "type-punning", at least on GCC. I will give details on how to do this, if you ask. I have written this code before, as an inline function.

If you only have a small range of numbers to test, you could possibly cast your numbers to integer and then use a lookup table.

Community
  • 1
  • 1
steveha
  • 74,789
  • 21
  • 92
  • 117
  • For x=536870911, `logf(x) / logf(2.0)` returns 29 but should return 28. When the integer x is passed to logf, it is converted to a float, which cannot represent it accurately, so it is rounded. – Eric Postpischil Aug 15 '12 at 00:31
  • logbf depends on the floating-point radix. Commonly, IEEE 754 binary floating-point is in use, and the radix is 2. However, the standard C function frexp guarantees an exponent using base 2, so it is portable where logbf is not. – Eric Postpischil Aug 15 '12 at 00:39
  • If `536870911` is one of the inputs that must work correctly, then I guess instead of `logbf()`, the `logb()` function should be called (with a `double` value). And clearly if the radix is not 2, another solution completely will need to be used. I don't believe I have ever used a computer that does not have radix 2 float values; have you? What would that even be? – steveha Aug 15 '12 at 02:21
  • Yes, I have used several computers that did not use base 2, such as the Xerox Sigma 9, the VAX-11, the IBM System/360, and the HP-3000. Before IEEE 754, base 16 was not uncommon. Today, base 10 remains in use on some systems. Before my time, there was a base 3 computer. – Eric Postpischil Aug 15 '12 at 02:38
1

You can make use of floating number representation:

double n_bytes = numOfBytes

Taking the exponent bits should give you the result as floating numbers are represented as:

(-1)^S X (1. + M) X 2^E

Where: S - Sign M - Mantissa E - Exponent

To construct the mask and shift you would have to read about the exact bit pattern of the floating point type you are using.

The CPU floating point support does most of the work for you.

An even better way would be to use the built-in function:

double frexp (double x, int * exp );

Floating point representation

Xyand
  • 4,470
  • 4
  • 36
  • 63
0

This is generally fast on any machine with hardware floating point unit:

((union { float val; uint32_t repr; }){ x }.repr >> 23) - 0x7f

The only assumptions it makes are that floating point is IEEE and integer and floating point endianness match, both of which are true on basically all real-world systems (certainly all modern ones).

Edit: When I've used this in the past, I didn't need it for large numbers. Eric points out that it will give the wrong result for ints that don't fit in float. Here is a revised (albeit possibly slower) version that fixes that and supports values up to 52 bits (in particular, all 32-bit positive integer inputs):

((union { double val; uint64_t repr; }){ x }.repr >> 52) - 0x3ff

Also note that I'm assuming x is a positive (not just non-negative, also nonzero) number. If x is negative you'll get a bogus result, and if x is 0, you'll get a large negative result (approximating negative infinity as the logarithm).

R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711
  • For 33554431, this returns 25 but should return 24. When the integer x is used to initialize val, it is converted to a float, which cannot represent it accurately, so the value is rounded. – Eric Postpischil Aug 15 '12 at 00:35
0
#include <Limits.h>  // For CHAR_BIT.
#include <math.h>    // For frexp.
#include <stdio.h>   // For printing results, as a demonstration.


// These routines assume 0 < x.


/*  This requires GCC (or any other compiler that supplies __builtin_clz).  It
    should perform well on any machine with a count-leading-zeroes instruction
    or something similar.
*/
static int log2A(unsigned int x)
{
    return sizeof x * CHAR_BIT - 1 - __builtin_clz(x);
}


/*  This requires that a double be able to exactly represent any unsigned int.
    (This is true for 32-bit integers and 64-bit IEEE 754 floating-point.)  It
    might perform well on some machines and poorly on others.
*/
static int log2B(unsigned int x)
{
    int exponent;
    frexp(x, &exponent);
    return exponent - 1;
}


int main(void)
{
    // Demonstrate the routines.
    for (unsigned int x = 1; x; x <<= 1)
        printf("0x%08x:  log2A -> %2d, log2B -> %2d.\n", x, log2A(x), log2B(x));

    return 0;
}
Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312