
Possible Duplicate:
Compute fast log base 2 ceiling

What is the fastest possible way in C/C++ to find out how many binary digits a particular integer has, i.e. how many bits are needed to represent it?

Ex. 47(10) = 101111(2)

So 47 has 6 binary digits.

nyxz
  • You basically want to [compute ceil(log2(n))](http://stackoverflow.com/questions/3272424/compute-fast-log-base-2-ceiling), which has already been asked here. – In silico Jan 24 '12 at 17:20
  • _BitScanReverse (MSVC) or equivalent. Anything that calculates a floating point logarithm is instantly disqualified when asking for the fastest way. – harold Jan 24 '12 at 17:25
  • The fastest way will depend on the processor, but you can find gcc and MS solutions that will use a cpu instruction here: http://stackoverflow.com/questions/3272424/compute-fast-log-base-2-ceiling – Drew Dormann Jan 24 '12 at 17:27
  • This is not a duplicate of http://stackoverflow.com/questions/3272424/compute-fast-log-base-2-ceiling. The requested function transitions at different locations. It is, however, a possible duplicate of http://stackoverflow.com/questions/2721244/what-is-the-fastest-way-to-calculate-the-number-of-bits-needed-to-store-a-number. – Brent Bradburn Feb 20 '13 at 20:33

9 Answers

11

For a quick fun way of doing this without needing to call math functions, check this one out:

for (digits = 0; val > 0; val >>= 1)
    digits++;

As a bonus, this should cook down to a memory load and 2 registers in use, for extra whiz-bang.
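For reference, here is that loop wrapped as a self-contained function (the name `bit_length` and the choice to return 0 for an input of 0 are my own):

```cpp
// Count binary digits by shifting the value right until it reaches zero.
// Note: returns 0 for val == 0; adjust if you want 0 to count as 1 digit.
unsigned bit_length(unsigned val) {
    unsigned digits = 0;
    for (; val > 0; val >>= 1)
        digits++;
    return digits;
}
```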

tbert
  • I think this is the best answer so far.. – calebds Jan 24 '12 at 17:28
  • Nice and simple. With linear complexity. – nyxz Jan 24 '12 at 17:49
  • Too simple. OP said "fastest way", not "simplest" (is this really any simpler than using an intrinsic function anyway?) – harold Jan 24 '12 at 17:56
  • This is not the "fastest possible way" so there's a divide between the OP's question and what he marked as the accepted answer. Maybe this is a language barrier, or "fastest possible way" means the fastest to type rather than the fastest with regard to execution?... – Louis Ricci Jan 24 '12 at 18:23
  • The OP wasn't too specific about what was meant by "fastest". This could easily be the fastest solution given the right dataset, computer architecture, and/or compile-time optimizations. It's definitely the fastest for me to comprehend. – Brent Bradburn Feb 20 '13 at 20:44
7

If you're looking for the "fastest" way in terms of performance, you will need to resort to platform-specific methods.

Some architectures actually have an instruction that does that.

On x86, you have the bsr instruction.

In MSVC, it is accessible as:

inline int bitlength(unsigned long x){
    if (x == 0)
        return 0;

    unsigned long index;
    _BitScanReverse(&index,x);
    return (int)(index + 1);
}

GCC has the __builtin_clz() intrinsic - which does something similar.
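A sketch of the equivalent wrapper for GCC/Clang (the name `bitlength_gcc` is mine; note that `__builtin_clz(0)` is undefined, hence the guard, and this assumes a 32-bit `unsigned int`):

```cpp
// Bit length via count-leading-zeros on GCC/Clang.
// __builtin_clz(0) is undefined behavior, so handle 0 explicitly.
inline int bitlength_gcc(unsigned int x) {
    if (x == 0)
        return 0;
    return 32 - __builtin_clz(x); // assumes unsigned int is 32 bits
}
```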

Mysticial
  • This is the answer that should be marked correct (also it came in a minute before mine). The BSR opcode is the fastest method for x86, in the general case the branching IF statements would likely be fastest. – Louis Ricci Jan 24 '12 at 18:26
  • Extra note for completeness, the branching method IFs that I noted in my post is also faster than the ORing method (in another answer), but slower than BSR instruction. snipt.org/xrom3 (312ms BSR, 406ms IFs, 671ms OR) I used FlatAssembler to benchmark the functions. – Louis Ricci Jan 24 '12 at 20:50
6

The fastest solution that's presented at my favorite collection of bit twiddling hacks is Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup. It requires 13 instructions to find the highest set bit in a number.

uint32_t v; // find the log base 2 of 32-bit v
int r;      // result goes here

static const int MultiplyDeBruijnBitPosition[32] = 
{
  0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
  8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};

v |= v >> 1; // first round down to one less than a power of 2 
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;

r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> 27];
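Since the table yields floor(log2(v)), the digit count the question asks for is `r + 1` for v > 0. As a self-contained sketch (the function name is mine):

```cpp
#include <cstdint>

// De Bruijn multiply-and-lookup: floor(log2(v)) with no branches.
static const int MultiplyDeBruijnBitPosition[32] = {
    0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
    8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};

int log2_debruijn(uint32_t v) {
    v |= v >> 1;  // smear the highest set bit downward...
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16; // ...so v is now one less than the next power of 2
    return MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> 27];
}
```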
sblom
  • While I think this ORing method is cool (and as a programmer I can appreciate it artistically), it's not optimal. You're doing 10 operations' worth of ORs and shifts, then a MUL with another shift, then a memory look-up (LUT). I'd say this is as slow as the Log2 methods that were proposed. – Louis Ricci Jan 24 '12 at 18:17
  • @LastCoder What are you envisioning log2 to be doing behind the scenes? This implementation openly admits it needs 13 (non-FP) instructions. – sblom Jan 24 '12 at 18:28
  • If the target architecture has a fast cache then the LUT shouldn't be too expensive. – Sean U Jan 24 '12 at 18:37
  • Just measured on my Intel i7-based MacBook Pro. 1,000,000 iterations of ceil(log2(n)) took 33,132 microseconds, and 1,000,000 iterations of the bit-twiddling version took 8,410 microseconds--right around 75% faster. – sblom Jan 24 '12 at 19:30
  • How about the x87-FPU instruction FYL2XP1. This is what I expect Log2 to use on all x86 & x86-64 bit machines. It would put the number in ST(0) and it would put the constant 1 in ST(1), then execute FYL2XP1, which does ST0 = ST1 * log2( ST0 + 1.0 ), then you'd pop/store ST(0) as an integer again. So that would be 4 instructions total for the optimal FPU version. – Louis Ricci Jan 24 '12 at 19:35
  • Code published at https://github.com/sblom/sandbox/blob/master/try/cpp/binary_digit_count.cpp – sblom Jan 24 '12 at 19:41
  • @LastCoder, yes--4 instructions, but since they're FP instructions, any one may take longer than the entire 32-bit int algorithm. – sblom Jan 24 '12 at 19:42
  • @sblom - I wrote the 3 function (BSR, FPU, and OR) in assembly and benched them. I got the following results: (312ms BSR, 2949ms FPU, 686ms OR) using 0x7FFFFFF runs each. So, the OR method is 75% faster than FPU and the BSR method is 50% faster than OR. http://snipt.org/xrkn5 FASM was the assembler used. SO I WAS OVERZEALOUS/WRONG assuming that this OR method was as slow as FPU. – Louis Ricci Jan 24 '12 at 20:26
  • @sblom - Extra note for completeness, the branching method IFs that I noted in my post is also faster than the OR method, but slower than BSR instruction. http://snipt.org/xrom3 (312ms BSR, 406ms IFs, 671ms OR) – Louis Ricci Jan 24 '12 at 20:43
  • @LastCoder, Didn't even know the BSR instruction before today. That's extremely neat. Thanks for the knowledge boost! – sblom Jan 24 '12 at 21:30
4

The traditional way

int count = 32;
for (unsigned int i = 1u << 31; i != 0; i >>= 1, count--)
    if ((number & i) != 0) return count;

You can get more fancy with optimization.
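As a complete function (my wrapper around the loop above; `1u << 31` sidesteps the signed-overflow UB of shifting into the sign bit):

```cpp
// Scan from the most significant bit down; return the 1-based position of
// the highest set bit, i.e. the number of binary digits. 0 yields 0.
int bits_msb_scan(unsigned int number) {
    int count = 32;
    for (unsigned int i = 1u << 31; i != 0; i >>= 1, count--)
        if ((number & i) != 0) return count;
    return 0;
}
```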

EDIT 2 Here's the fastest code I could think of without the use of the Bit Scan Reverse opcode. You could use a bigger (256-entry) LUT and remove the last IF statement. In my testing this was faster than the repeated OR-SHIFT-then-LUT method described in another post.

static const int Log2_LUT[16] = {0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4};

int Log2(unsigned int number) {
    int count = 0;
    if ((number & 0xFFFF0000) != 0) {
        number >>= 16;
        count += 16;
    }
    if ((number & 0x0000FF00) != 0) {
        number >>= 8;
        count += 8;
    }
    if ((number & 0x000000F0) != 0) {
        number >>= 4;
        count += 4;
    }
    return count + Log2_LUT[number];
}

Or, if you're on an x86 or x86-64 architecture, you can use the BSR (Bit Scan Reverse) opcode. You can find the C++ intrinsic for it at http://msdn.microsoft.com/en-us/library/fbxyd7zd%28v=vs.80%29.aspx

Also, your question is similar to this one: What is the fastest way to calculate the number of bits needed to store a number

EDIT Why the log2 answers are not optimal... While mathematically correct, transcendental floating-point operations (sine, cosine, tangent, log) are among the slowest-performing operations on modern computers. This is compounded by having to convert the integer to a float and having to ceiling/floor the result as well.

Louis Ricci
3

Try a base-2 logarithm:

ceil(log2(n))
Alex Reynolds
  • how about 1? Zero binary digits? – ElKamina Jan 24 '12 at 17:19
  • The OP is asking for *the fastest possible way*, and using floating-point operations like `log2()` and `ceil()` is certainly not the fastest way for this kind of problem. – In silico Jan 24 '12 at 17:24
2

If the integer is at least 1, the required bits would be:

floor(log2(x)) + 1
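A quick sketch of that formula in code (fine for correctness, though as other comments note, the floating-point route is slow, and rounding near exact powers of two deserves care; the function name is mine):

```cpp
#include <cmath>

// Bit count via floating-point log2; valid for x >= 1.
int bits_via_log(unsigned int x) {
    return (int)std::floor(std::log2((double)x)) + 1;
}
```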
Drew Dormann
  • The OP is asking for *the fastest possible way*, and using floating-point operations like `log2()` and `ceil()` is certainly not the fastest way for this kind of problem. – In silico Jan 24 '12 at 17:26
  • Agreed. I didn't see that. Voting to close as dupe. – Drew Dormann Jan 24 '12 at 17:29
2

If speed is more important than portability, then some compilers provide a "count leading zeros" function. This compiles to a single machine instruction on some processors including modern x86 and ARM. For example, with GCC:

CHAR_BIT * sizeof x - __builtin_clz(x)
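Wrapped into a function with the zero case handled (my sketch; `__builtin_clz(0)` is undefined, so it needs a guard):

```cpp
#include <climits>

// Bit width via count-leading-zeros; works for any unsigned int width.
unsigned bit_width(unsigned int x) {
    if (x == 0)
        return 0; // __builtin_clz(0) is undefined behavior
    return CHAR_BIT * sizeof x - __builtin_clz(x);
}
```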
Mike Seymour
1

try to use logarithms:

ceil(log2(x))
dfens
  • `floor` is wrong - using the example from the question, `log2(47) = 5.55`; taking the `floor` of that gives 5 instead of the correct 6. –  Jan 24 '12 at 17:15
  • To be mathematically nit-picky it's `floor(log2(x)) + 1`. – calebds Jan 24 '12 at 17:19
  • The OP is asking for *the fastest possible way*, and using floating-point operations like `log2()` and `ceil()` is certainly not the fastest way for this kind of problem. – In silico Jan 24 '12 at 17:26
  • `ceil(log2(x))` is incorrect. it should be `floor(log2(x))+1` – midi Nov 23 '19 at 09:12
0

One way is this...

unsigned int count_bits(unsigned int n)
{
    unsigned int count = 0;

    if ( n <= 1 )
      return 1;

    do
      count++;
    while( n >>= 1 );

    return count;
}
MetallicPriest