14

How can I get the number of "1"s in the binary representation of a number without actually converting and counting ?

e.g.

  def number_of_ones(n):
      # do something
      # I want to MAKE this FASTER (computationally less complex).
      c = 0
      while n:
          c += n % 2
          n //= 2  # floor division, so this also works on Python 3
      return c


>>> number_of_ones(5)
2
>>> number_of_ones(4)
1
Peter Cordes
Pratik Deoghare
  • Is this a duplicate of http://stackoverflow.com/questions/843737/count-bits-in-the-number-closed – ChrisW May 09 '09 at 19:01
  • @ChrisW - Python and C are two different languages – TStamper May 09 '09 at 19:03
  • 1
    It's not an exact duplicate. Bitwise operations in Python are much more expensive than in C. – Nadia Alramli May 09 '09 at 19:04
  • Python integers already *are* binary, that's why it makes sense to talk about their hamming weight / popcount. [Count the number of set bits in a 32-bit integer](https://stackoverflow.com/a/109069) notes that Python since 3.10 has `int.bit_count()` – Peter Cordes Dec 05 '22 at 15:10
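As the last comment notes, Python 3.10 added `int.bit_count()`, which answers the question directly on current versions; a minimal sketch with a portable fallback for older interpreters:

```python
def popcount(n: int) -> int:
    """Count set bits, using the native method when available (Python 3.10+)."""
    try:
        return n.bit_count()       # Python >= 3.10
    except AttributeError:
        return bin(n).count("1")   # portable fallback for older versions

print(popcount(5))  # 5 = 0b101 -> 2
print(popcount(4))  # 4 = 0b100 -> 1
```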

9 Answers

12

I'm not a Python programmer, but hopefully this will be enough for you to follow.

c = 0
while n:
    c += 1
    n &= n - 1

return c

While a little obscure, its primary advantage is speed and simplicity. The while loop iterates only once for each bit set to 1 in n.
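In Python, that snippet drops straight into a function (the trick is usually attributed to Kernighan; each `n &= n - 1` clears the lowest set bit):

```python
def count_ones(n: int) -> int:
    # Each iteration clears the lowest set bit, so the loop runs
    # once per 1-bit rather than once per bit position.
    c = 0
    while n:
        n &= n - 1
        c += 1
    return c

print(count_ones(5))  # 0b101 -> 2
```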

Prashant Kumar
Stephen Nutt
7

You cannot make this computationally less complex. It will be O(n) in the number of bits, or, as the answer with the & trick showed, O(k) in the number k of bits set to 1; but unless all of the numbers you are using are a special case, the latter should on average be n/2, so both bounds are the same.

And the lookup-table trick, of course, is actually doing nothing for the computational complexity; it's just paying for time with space but without changing the underlying economics, which are that you must examine each bit once somehow and there is no way around that. You cannot, logically, answer a question about the bits in the number without inspecting each of them.

Now, I suppose I'm being a bit sloppy, since many of these examples are actually O(n^2): in Python you have to examine the whole number at each step, so with a Python long integer of, say, 100 bytes, a +, &, or / operation will look at each byte at least once, and that will happen over and over until the number is reduced to zero (in the schemes outlined above). So these, again, are really O(n^2) operations. I am not sure Python allows a true O(n) solution here.

Anyway: if you were really asking about computational complexity, which specifically means big-O analysis, that's your answer. :-)

Brandon Rhodes
5

IMO, a good approach would be to use a look-up table - create a dictionary which converts bytes to number of 1's (you can use the code you posted to generate it, it would only need to run once), and then use something like this:

def number_of_ones(n):
    total = 0  # avoid shadowing the built-in sum()
    while n != 0:
        total += lookup_table[n & 0xff]
        n >>= 8
    return total

I believe this is a fairly good trade-off between space and running-time.
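The code above assumes `lookup_table` already exists; a minimal sketch of building it once for all 256 byte values (a list works as well as a dictionary here, since the keys are dense small integers):

```python
# Precompute the popcount of every possible byte value once at startup.
lookup_table = [bin(i).count("1") for i in range(256)]

print(lookup_table[0b1011])  # 3
print(lookup_table[0xff])    # 8
```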

drrlvn
5

If you want to do it in a single line, you could use:

sum( [x&(1<<i)>0 for i in range(32)] )
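The hard-coded 32 assumes the input fits in 32 bits; for arbitrary Python ints, the same idea works by ranging over `bit_length()` instead:

```python
def ones(x: int) -> int:
    # Test each bit position up to the highest set bit.
    return sum((x >> i) & 1 for i in range(x.bit_length()))

print(ones(5))       # 2
print(ones(2**100))  # 1
```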
tom10
3

https://en.wikipedia.org/wiki/Hamming_weight.

A solution in O(log(n)), where n is the bit length:

m1  = 0x5555555555555555 #binary: 0101...
m2  = 0x3333333333333333 #binary: 00110011..
m4  = 0x0f0f0f0f0f0f0f0f #binary:  4 zeros,  4 ones ...
m8  = 0x00ff00ff00ff00ff #binary:  8 zeros,  8 ones ...
m16 = 0x0000ffff0000ffff #binary: 16 zeros, 16 ones ...
m32 = 0x00000000ffffffff #binary: 32 zeros, 32 ones
h01 = 0x0101010101010101 #the sum of 256 to the power of 0,1,2,3...

def hammingWeight(x: int) -> int:
    x -= (x>>1)&m1
    x = (x&m2) + ((x>>2)&m2)
    x = (x+(x>>4))&m4
    return ((x*h01)>>56)&0x7f

And since Python 3.10, the int type has a bit_count() method.
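Since Python ints are arbitrary precision, the masked version above only counts the low 64 bits; a hedged sketch of extending it by folding the number 64 bits at a time:

```python
# Masks for the classic 64-bit SWAR popcount.
M1  = 0x5555555555555555
M2  = 0x3333333333333333
M4  = 0x0f0f0f0f0f0f0f0f
H01 = 0x0101010101010101

def popcount64(x: int) -> int:
    # Standard SWAR bit count; input must fit in 64 bits.
    x -= (x >> 1) & M1
    x = (x & M2) + ((x >> 2) & M2)
    x = (x + (x >> 4)) & M4
    return ((x * H01) >> 56) & 0x7f

def popcount(n: int) -> int:
    # Process an arbitrarily large int in 64-bit chunks.
    total = 0
    while n:
        total += popcount64(n & 0xffffffffffffffff)
        n >>= 64
    return total

print(popcount((1 << 100) - 1))  # 100
```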

oxfn
  • Python integers are arbitrary precision; this will popcount the low 64 bits. Which might be fine for some use-cases. Obviously `bit_count()` if *much* better for both clarity and performance. – Peter Cordes Dec 05 '22 at 15:14
1

Here:

def bitCount(int_no):
    c = 0
    while int_no:
        int_no &= int_no - 1
        c += 1
    return c

This is an old and efficient way to do it, originally implemented in C (the algorithm has a name I can't remember). It works fine for me; hope it does for you too.

Siong Thye Goh
Wells
1

If you're actually concerned about speed, code it up in C (see this question for how), and interface with the C implementation using something like ctypes.

Adam Rosenfield
0
p = lambda n:n and 1+p(n&(n-1))

This uses short-circuiting and recursion: when n is greater than 0, it evaluates 1 + p(n & (n - 1)), clearing the lowest set bit on each call; when n is 0 it returns 0. The complexity is O(k), where k is the number of ones in the binary representation, since that is how many recursive calls are made.

Example: p(9) = 1+p(8) = 1+1+p(0) = 1+1+0=2
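A quick sanity check of the one-liner against Python's own conversion (note that numbers with very many set bits will eventually hit the recursion limit, since the depth equals the popcount):

```python
p = lambda n: n and 1 + p(n & (n - 1))

# 0 short-circuits to 0; otherwise recurse with the lowest set bit cleared.
for n in (0, 4, 5, 9, 255, 2**40 - 1):
    assert p(n) == bin(n).count("1")
print("matches bin().count for all samples")
```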

Michael M.
kanine
0

I came here today to check the state of the art.

This subject comes up from time to time; modern CPUs even have a dedicated population-count instruction. It is useful for calculating the Bit Error Rate in certain communications applications. This is the Hamming weight, related to the Hamming distance; scipy has an implementation, but it works with arrays, not with the binary representation of numbers. For fixed-size computation, e.g. on numpy arrays, the fastest approach I know is the masked method from the answer above, which I am calling the divide and conquer method. In the general case, divide and conquer has complexity O(log(n)) instead of O(n). The accepted answer is really good for most cases, and even better in languages such as C, where you could simply take `((char*)&bits)[i]` instead of shifting the number.

Here I give a general implementation for the divide and conquer approach where the masks are computed dynamically depending on the size of the input number.

def dq_number_of_ones(n):
    # Round the bit width up to the next power of two.
    nb = 1
    while n >> nb:
        nb *= 2
    # Build the masks dynamically for that width
    # (e.g. 0b11110000, 0b11001100, 0b10101010 for nb = 8).
    t = (1 << nb) - 1
    masks = []
    while nb > 1:
        nb //= 2
        t ^= t >> nb
        masks.append(t)

    # Sum adjacent groups of bits, doubling the group size each pass.
    t = n
    s = 1
    for tm in reversed(masks):
        t = ((tm & t) >> s) + ((tm >> s) & t)
        s *= 2
    return t

And for completeness, here are the OP's method and the lookup table method:

def number_of_ones(n):
    # do something
    # I want to MAKE this FASTER (computationally less complex).
    c = 0
    while n:
      c += n%2
      n //= 2
    return c
lookup_table = [number_of_ones(i) for i in range(256)]

def lt_number_of_ones(n):
    total = 0
    while n != 0:
        total += lookup_table[n & 0xff]
        n >>= 8
    return total
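Before timing, a quick self-contained sanity check that byte-wise lookup agrees with Python's own conversion (the table build is restated compactly so this runs standalone):

```python
# Reference implementation: Python's own string conversion.
ref = lambda n: bin(n).count("1")

# Compact restatement of the byte-lookup method.
table = [ref(i) for i in range(256)]

def lt(n):
    total = 0
    while n:
        total += table[n & 0xff]
        n >>= 8
    return total

for n in (0, 1, 5, 255, 256, 10**30, 7**77):
    assert lt(n) == ref(n)
print("lookup method agrees with bin().count")
```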

A practical comparison of the two

import time
import numpy as np  # needed for np.nan below
import matplotlib.pyplot as plt
x = []
t1 = []
t2 = []
t3 = []
for i in range(3,8000,20):
  y = i**i
  if i < 1000:
    time.sleep(0.0001)
    t0 = time.time()
    w1 = number_of_ones(y)
    t1.append(time.time() - t0)
  else:
    t1.append(np.nan)
  time.sleep(0.0001)
  t0 = time.time()
  w2 = dq_number_of_ones(y)
  t2.append(time.time() - t0)
  time.sleep(0.0001)
  t0 = time.time()
  _ = lt_number_of_ones(y)
  t3.append(time.time() - t0)
  time.sleep(0.0001)
  
  x.append(i)

plt.figure(figsize=(12, 4))
plt.subplot(121)
plt.plot(x, t1)
plt.plot(x, t2)
plt.plot(x, t3)
plt.xlim([10,None])
plt.title('Linear scale')
plt.ylabel('time to count bits in $x^x$')
plt.subplot(122)
plt.loglog(x, t1)
plt.loglog(x, t2)
plt.loglog(x, t3)
plt.xlim([10,None])
plt.title('Log scale')
plt.legend(['direct counting', 'divide and conquer', 'lookup table'])

(Plot: time to count bits in $x^x$ for the three methods, on linear and log scales.)

Bob
  • The strategy with fixed-width masks like `0x5555555555555555` would best be described as an SWAR bithack. (As explained in [Count the number of set bits in a 32-bit integer](https://stackoverflow.com/a/109025)). I guess divide and conquer does kind of fit, though, as you divide into 2-bit accumulators that you widen. That version only works for 64-bit integers and narrower, though. One could check for the size being larger in a program that sometimes uses huge integers. I assume these are all horribly slow compared to the built-in `bit_count()` which hopefully makes these all obsolete. – Peter Cordes Dec 05 '22 at 15:22
  • 1
    Counting iterations of `n &= n - 1` (usually attributed to Kernighan) is another algorithm worth looking at that's good for numbers with only a few set bits. – Peter Cordes Dec 05 '22 at 15:23
  • This [Kernighan](https://en.wikipedia.org/wiki/Brian_Kernighan)? – Bob Dec 06 '22 at 16:21
  • 1
    Yes; an answer on [Count the number of set bits in a 32-bit integer](https://stackoverflow.com/a/37558380) quotes http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan which attributes it to *C Programming Language 2nd Ed.* in exercise 2-9 (by K & R). But also that it "was first published by Peter Wegner in CACM 3 (1960), 322". Still, as I said, it's often called Kernighan's method, perhaps because of Sean Anderson's attribution of it. When I commented yesterday, I hadn't checked on the history. – Peter Cordes Dec 06 '22 at 16:28
  • Nice info :) you must be a very annoying person!!! hehe – Bob Dec 06 '22 at 16:34