5

Possible Duplicate:
Efficient bitwise operations for counting bits or find the right|left most ones

Is there a fast way to find the first 1 in a (32 bit) binary number?

e.g. if I have the number 00000000 00000000 00000000 10000000

I want to calculate the value "7" (or "24", if read from the other side) since the first zero in the number is stored in the 7th bit from the right.

Is there a faster way to do this than

int pos=0;
int number = 127; //binary from above
while ((number>>pos) > 0) { pos++; }

?

Perhaps a specific x86 assembler instruction?

Community
  • 1
  • 1
TravisG
  • 2,373
  • 2
  • 30
  • 47
  • 3
    are you assuming there is only ever one '1'? – corn3lius Nov 20 '12 at 15:58
  • 1
    First 1 from the left (from the MS bit) or first 1 from the right (from the LS bit) ? – Paul R Nov 20 '12 at 15:59
  • Well the answer is either `bsf` or `bsr` (or equivalent intrinsics) – harold Nov 20 '12 at 15:59
  • 6
    Why do you think you need to optimize this? Are you calling it millions of times? – Steve Wellens Nov 20 '12 at 15:59
  • 1
    Use one of the methods described here: [Bit Twiddeling Hacks](http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious) – Landei Nov 20 '12 at 16:00
  • 1
    @SteveWellens I do not know about the OP, but I actually have code where I call `bsr` every ~300 CPU cycles in scientific code that runs for hours. So it makes sense to think about this. – Sergey L. Nov 20 '12 at 17:25
  • @Steve Wellens, yes, it's part of an entropy decoder and that specific algorithm runs twice per color component per pixel of an image, which takes about 3% off of the performance of the whole decoder (tested using profilers). If there's a not-too-complex faster way to do it, it would be an easy 3% performance gain. – TravisG Nov 21 '12 at 07:51
  • @TravisG - Then it's worth doing. Carry on. – Steve Wellens Nov 21 '12 at 14:18

5 Answers5

5

With gcc, you can use __builtin_ctz and __builtin_clz. ctz gives the number of trailing 0 bits, and clz gives the number of leading 0-bits.

William Pursell
  • 204,365
  • 48
  • 270
  • 300
3

Yes, there is a faster way -- without using macros.

With your method, you can have up to 32*2 ops.

Here is a logarithmic algorithm.

First, you split the number in 2 shorts, and check for 0 the lower short. If 0, you go and check the high part, keeping offset=16. If not 0, go and check the low part, with offset = 0. You will remain with a short and an offset

Next, split in 2 chars the remaining part, and proceed the same. You will remain with a char and an offset.

Next, split the char in 2 parts of 4 bits, and check the same.

You will make maximum log 32 * 2 operations.

alinsoar
  • 15,386
  • 4
  • 57
  • 74
  • While that is true, based on the last line of his question I seriously doubt it's the answer he's looking for. – harold Nov 20 '12 at 16:17
3

You are looking for the bit scan instructions of x86

__inline__ size_t bsf(size_t input) {
    size_t pos;
    __asm__ ("bsf %1, %0" : "=r" (pos) : "rm" (input));
    return pos;
}

If using inline asm make sure that both pos and input are the same storage class (2, 4, or 8 byte integer types). This inline function should be no problem.

Most compilers have intrinsics that use this instruction, but MSVC is the only one i know that has the direct one.

For the highest order bit set use bsr instruction instead, same syntax.

NOTE: if input is 0 (no bits set) then the result is undefined!

Here is a version that will put a predefined constant into pos if the input is 0:

#define BIT_SCAN_IFZERO 0

__inline__ size_t bsf(size_t input) {
    size_t pos, ifzero = BIT_SCAN_IFZERO;
    __asm__ ( "bsf %1, %0\n\t"
              "cmovz %2, %0"
            : "=r" (pos)
            : "rm" (input)
            , "rm" (ifzero));
    return pos;
}

Define BIT_SCAN_IFZERO to whatever you like. If you want a negative number there then change size_t to ssize_t (signed size type)

Sergey L.
  • 21,822
  • 5
  • 49
  • 75
  • The fact that it's undefined when input is 0 is pretty uncomfortable. I have to introduce branches around the code to check for that possibility. Is there a way to get around this? – TravisG Nov 21 '12 at 09:59
  • @TravisG added a version that will put a predefined constant in. It is a conditional move, no need for a branch. – Sergey L. Nov 21 '12 at 10:21
1

The MSVC intrinsics are _BitScanForward and _BitScanReverse.

You can look up their arguments here, they're not overly intuitive (the value you want is not the return value).

harold
  • 61,398
  • 6
  • 86
  • 164
-2

That is pretty much the fastest way to do it. If you we're dealing with the potential of multiple words you could test the first word then second so on and find the lowest not 0 word. You will still have to shift it though.

That Guy
  • 31
  • 1