3

I have a uint64_t and I would like to find the index of the first set bit, reset it to zero and find the next set bit.

How do I know when to terminate? BSF on all zeros is undefined...

const uint64_t input = source;

if(0 != input){

    int32_t setIndex = GCC_BSF_INTRINSIC(input);

    while(setIndex != UNDEFINED???){

        //Do my logic

        //Reset
        input[setIndex] = 0;

        setIndex = BSF_Variant(input);
    }
}

Could somebody please help?

user997112
  • 29,025
  • 43
  • 182
  • 361

2 Answers2

2

The simplest would be to just check the input:

while (input) {
    int32_t index = __builtin_ffsll(input);
    // do stuff
}

More complicatedly, according to the docs the docs:

— Built-in Function: int __builtin_ffs (int x)
Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.

Which lets you do:

for (int index  = __builtin_ffsll(input); 
     index; 
     index = __builtin_ffsll(input))
{
    // do stuff
}

Which accomplishes the same thing, you just have to repeat the __builtin_ffsll call, so it's more verbose and in my opinion doesn't contribute to clarity.

Barry
  • 286,269
  • 29
  • 621
  • 977
0

2 points to keep in mind when using __builtin_ffs:

  1. in order to get the next bit, you need to clear the recently found bit
  2. if you are planning to use the result, for bit shifting or table indexing, you would most likely need to decrease it by one.
while (m) {
    // Get the rightmost bit location. 

    int BitOffset = __builtin_ffs(m);

    // Clear the bit before the next iteration. 
    // Used in the loop condition.

    m = (m >> BitOffset) << BitOffset;

    // Do your stuff .....
}
Adrian Mole
  • 49,934
  • 160
  • 51
  • 83
  • A more efficient way to clear the lowest set bit is `m &= m-1;`, which an compile to [`blsr`](https://www.felixcloutier.com/x86/blsr), or with BMI1 to LEA / AND (2 uops) instead of two variable-count shifts (6 uops total on Intel and forcing the compiler to use ECX). – Peter Cordes Sep 04 '21 at 22:15
  • Intel users are lucky for having intrinsic-s and it's makes sense. on the DSP platform I was working on this way was faster and the assembly "interleaved" with the code in the loop. – Rami Gideoni Sep 30 '21 at 03:00