1

E.g. For the input 5, the output should be 7. (bin(1) = 1, bin(2) = 10 ... bin(5) = 101) --> 1 + 1 + 2 + 1 + 2 = 7

Here's what I've tried, but it isn't a very efficient algorithm, considering that I iterate the loop once for each integer. My code (Python 3):

i = int(input())
a = 0
for b in range(i+1):
  a = a + bin(b).count("1")
print(a)

Thank you!

Peyton
  • 53
  • 1
  • 7
  • Please check this question : https://stackoverflow.com/questions/9829578/fast-way-of-counting-non-zero-bits-in-positive-integer – Ahmad Dec 09 '17 at 16:01
  • Because the count operation of each number is independent you can use thread level parallelism to speed up up your work. – Ahmad Dec 09 '17 at 16:03
  • The number of ones in the binary representations of 1 to n for values of n is sequence A000788 - some possibly useful formulae here: http://oeis.org/A000788 – Spacedman Dec 09 '17 at 16:05
  • @Ahmad The core of this loop is CPU bound, the GIL will block the threads. – pstatix Dec 09 '17 at 16:05
  • @Ahmad there's likely to be a pattern to the answer that can be exploited so that you don't need to iterate all the values. – Mark Ransom Dec 09 '17 at 16:05
  • According to [OEIS A000788](https://oeis.org/A000788)'s formulas section, a recursive solution that doesn't require counting `"1"`s in a string is a(0) = 0, a(2n) = a(n)+a(n-1)+n, a(2n+1) = 2a(n)+n+1. – Chai T. Rex Dec 09 '17 at 16:06
  • 1
    Furthermore, to follow up with @ChaiT.Rex, `a(2n)` represents the formulae for an even number, `a(2n+1)` represents the formulae for an odd number. – pstatix Dec 09 '17 at 16:13

2 Answers2

6

Here's a solution based on the recurrence relation from OEIS:

def onecount(n):
    if n == 0:
        return 0
    if n % 2 == 0:
        m = n/2
        return onecount(m) + onecount(m-1) + m
    m = (n-1)/2
    return 2*onecount(m)+m+1

>>> [onecount(i) for i in range(30)]
[0, 1, 2, 4, 5, 7, 9, 12, 13, 15, 17, 20, 22, 25, 28, 32, 33, 35, 37, 40, 42, 45, 48, 52, 54, 57, 60, 64, 67, 71]
Spacedman
  • 92,590
  • 12
  • 140
  • 224
  • It might be more efficient to remove the `n == 1` check, since most inputs will have to unsuccessfully check that, slowing down most inputs (and most recursive calls). Also, bitwise stuff like `if n & 1 == 0` and `m = n >> 1` (in front of the `if` to handle both cases) will likely speed things up a tiny bit. – Chai T. Rex Dec 09 '17 at 16:18
  • Yes, true. I spotted the n==1 just now. Was writing for readability first. – Spacedman Dec 09 '17 at 16:20
  • Why do you perform the calculation for `m`? Such as the `n/2` and `(n-1)/2`? – pstatix Dec 09 '17 at 16:21
  • Would it benefit by caching previously *calculated* results? – wwii Dec 09 '17 at 16:31
  • `m` is what's passed to the function recursively. Like with a factorial function where `f(x)` is `x * f(x-1)`. Here `f(N) = 2*f((N-1)/2) + (N-1)/2 + 1` for odd N. – Spacedman Dec 09 '17 at 16:31
  • 1
    @wwii memoization can be useful in some cases, but I didn't fancy going beyond basic python at this point. Feel free to write a memoize decorator and see what its effect is. – Spacedman Dec 09 '17 at 16:34
  • Using [lru_cache](https://docs.python.org/3/library/functools.html#functools.lru_cache) takes the time for onecount(2000) down from 70 microseconds to 100 nanoseconds on my machine. – chthonicdaemon Dec 09 '17 at 17:40
  • I would claim that this might be done considerably faster using gmpy2. See the other answer. – Bill Bell Dec 09 '17 at 17:44
  • @chthonicdaemon - and with the cache it *grows* at a different rate relative to the *size* of the n. – wwii Dec 10 '17 at 01:03
1

gmpy2, due to Alex Martella et al, seems to perform better, at least on my Win10 machine.

from time import time
import gmpy2

def onecount(n):
    if n == 0:
        return 0
    if n % 2 == 0:
        m = n/2
        return onecount(m) + onecount(m-1) + m
    m = (n-1)/2
    return 2*onecount(m)+m+1

N = 10000

initial = time()
for _ in range(N):
    for i in range(30):
        onecount(i)
print (time()-initial)

initial = time()
for _ in range(N):
    total = 0
    for i in range(30):
        total+=gmpy2.popcount(i)
print (time()-initial)

Here's the output:

1.7816979885101318
0.07404899597167969

If you want a list, and you're using >Py3.2:

>>> from itertools import accumulate
>>> result = list(accumulate([gmpy2.popcount(_) for _ in range(30)]))
>>> result
[0, 1, 2, 4, 5, 7, 9, 12, 13, 15, 17, 20, 22, 25, 28, 32, 33, 35, 37, 40, 42, 45, 48, 52, 54, 57, 60, 64, 67, 71]
Bill Bell
  • 21,021
  • 5
  • 43
  • 58