42

Can someone explains why Brian Kernighan's algorithm takes O(log N) to count set bits (1s) in an integer. A simple implementation of this algorithm is below (in JAVA)

int count_set_bits(int n){
    int count = 0;
    while(n != 0){
        n &= (n-1);
        count++;
    }
    return count;
}

I understand how it works by clearing the rightmost set bit one by one until it becomes 0, but I just don't know how we get O(log N).

t. fochtman
  • 431
  • 3
  • 9
peter
  • 8,333
  • 17
  • 71
  • 94

3 Answers3

33

This algorithm goes through as many iterations as there are set bits. So if we have a 32-bit word with only the high bit set, then it will only go once through the loop. In the worst case, it will pass once per bit. An integer n has log(n) bits, hence the worst case is O(log(n)). Here's your code annotated at the important bits (pun intended):

  int count_set_bits(int n){
        int count = 0; // count accumulates the total bits set 
        while(n != 0){
            n &= (n-1); // clear the least significant bit set
            count++;
        }
  }
PengOne
  • 48,188
  • 17
  • 130
  • 149
  • 1
    When you say 'n has log(n) bits' do you mean 'significant bits' or just bits. I would assume the number of total bits used to represent an integer would remain constant for a given platform. – Quest Monger Jul 08 '15 at 20:30
  • 3
    so I think even if we count the bit one by one, it is still `O(log n)`. Brian Kernighan's algorithm only improve on the average case or best case: if we assume half of the bits are `1`, then the loop is half many times as bit by bit... if the number has just 1 bit that is set, then instead of 32 or 64 times (or whenever that bit is cleared and making `a` become `0`, for example: loop 6 times if it is the 6th bit that is set), then Brian Kernighan's algorithm is just one time through the loop. If only 2 bits are `1`, then Brian Kernighan's algorithm will loop through twice only. – nonopolarity Feb 12 '17 at 17:14
  • 1
    @太極者無極而生 more compactly expressed: The naive solution is `theta(log n)` whereas Kernighans is `O(log n)`. – andreee Feb 21 '19 at 23:43
8

There are floor(lg(N)) + 1 significant bits in N -- that's a base-2 logarithm. The number of 1 bits in n is at most this. So the time will have asymptotic upper bound O(lg(N)) = O(log(N)).

The_Sympathizer
  • 1,191
  • 9
  • 17
5

This question is really about meaning of N in big O notation, not complexity of algorithm.

N means size of the data. But in case where where "data" is a single number you need to define what you understand as size of data. The value of a number or length of it's representation.

IMO the algorithm is O(N). Because in this problem of counting 1 in binary representation IMO relevant size of data is length of the number's representation, not it's value i.e. the length of the bit stream. And obvious worst case is all 1's taking N iterations.

But if you consider value of N as size of the data, it's representation has log(N) length so you can say it's O(log(N))

PS Also big O notation makes sense only if you generalise algorithm for arbitrarily high Ns. In this code N is limited by int size, so you can say it's O(1) as it will be at most 64 loop iterations (for 64 bit ints)

IvanSanchez
  • 18,272
  • 3
  • 30
  • 45