3

Edit: Problem solved, advert your eyes from the below code though, haha, I had some things conceptually wrong from the beginning.

I've been trying for longer than I care to admit to count the non-zero bits in a wide variety of 32-bit integers, but always seem a bit off.

So for example, the correct amount of non-zero bits for the numbers:

13676 9190872 136669 -17621 -1631 -183 15570 0 495 468656377 -1340216 -91

Would be:

8 12 10 26 25 27 8 0 8 18

However, I always end up a few digits off on at least a few (especially the negative numbers) no matter what I try.

Currently I'm using:

def bitCount():
    numbers = input().split()
    answer = []
    for num in numbers:
        data = bin(((1 << 32) + int(num)) & -5)
        bitCount = 0
        for char in data[3:]:
            if char == '1':
                bitCount += 1
        answer.append(str(bitCount))
    print(' '.join(answer))
bitCount()

Which for the above set of numbers generates: 7 12 9 25 24 26 8 0 7 18 19 26

What am I doing wrong? How would I go about solving this problem? I can't imagine it is too complex, but after hours of searching, reading, and experimenting I feel like I must just be missing something obvious. Any help would be greatly appreciated.

Vale
  • 1,003
  • 1
  • 9
  • 22
  • Why are you doing `data[3:]`? You only want to strip the first _two_ characters off a `bin` representation, not the first three. But really, why even create the `bin` representation with a `0b` just to pull them off? If you want to use strings, why not `format(n, 'b')`? – abarnert May 11 '15 at 01:04
  • Anyway, whatever list of "C bit-twiddling tricks" you got that `((1 << 32) + int(num)) & -5)` from, (a) it doesn't work in Python because Python has arbitrary-length ints, not fixed-bit-size ints, (b) you copied it wrong, and (c) it probably isn't going to be faster than doing it the obvious way in Python, and even if it were, both versions would be too slow for a tight inner loop and more than fast enough for anything else, so who cares? – abarnert May 11 '15 at 01:13

6 Answers6

3

Your whole approach doesn't make sense in Python.

The number of 1 bits in -17621, aka -100010011010101 binary, is 7. If you're expecting 26, you're not asking for the number of 1 bits in the number, you're asking for the number of 1 bits in the C 32-bit 2's complement signed integer representation of the number interpreted as a C 32-bit unsigned integer. If you want that in Python, you have to ask for it explicitly. For example, you can use num % 0x100000000.

Meanwhile, whatever list of bit-twiddling tricks you got the (1 << 32) + num & -5 from, it's also assuming C int32 math rather than actual integer arithmetic, so it's going to be wrong again. Plus, I'm pretty sure you copied it wrong. Plus, there's no reason for these kinds of tricks in Python—odds are it'll actually be slower, rather than faster, but either way, it's going to be way, way too slow to do a zillion times in a tight loop, and more than fast enough to do a few times, so this kind of squeeze-out-the-last-5% optimization is far more pointless in Python than in C. (Although many of these old tricks actually slow things down in C as well, with modern compilers and CPUs…)

And knocking off the initial 1 bit by doing [3:] again assumes that you've got a 32-bit representation, which is again wrong. (That's why all your positive numbers were off by 1—you're knocking off the first 1 bit.)

So, let's just make it simple:

answer = []
for num in numbers:
    data = int(num) % 0x100000000
    bits = format(data, 'b')
    answer.append(str(bits.count('1')))
print(' '.join(answer))

Notice that I used format(data, 'b'), so I don't have to knock off the 0b at the beginning.

Or, if you want to make it more compact:

print(' '.join(str(format(int(num)%0x100000000, 'b').count('1')) for num in numbers))

Either way, you get:

8 12 10 26 25 27 8 0 8 18 20 28

… which does have two more numbers than your expected output, but then your input also had two more numbers than your expected output, so I think that's to be expected.

abarnert
  • 354,177
  • 51
  • 601
  • 671
2

To coax python to convert to two's complement you can apply a bitmask of 32 bits of 1's (i.e 0xFFFFFFFF) to n. The result is n if n is positive. However, If n is negative the bitmask operation returns a positive number that has all of the same bits as the two's complement representation of the negative number n. Thus in either case the bit count is what you expect.

From there you can simply count the bits that come out as a result of the "bin" function.

[bin(n & 0xFFFFFFFF).count("1") for n in numbers]

The result for your example input is

[8, 12, 10, 26, 25, 27, 8, 0, 8, 18, 20, 28]

In addition to giving you the answer you expect, this other question suggests that bin(n).count is the fastest way to find the hamming weight for an integer

Community
  • 1
  • 1
Alfred Rossi
  • 1,942
  • 15
  • 19
1

Try this:

#!/usr/bin/env python                                                           


def bit_counter(number):                                                        

    suffix = 2                                                                  
    if number < 0:                                                              
        suffix = 3                                                              

    bit = bin(number)                                                           
    counter = 0                                                                 
    for i in bit[suffix:]:                                                      
        if int(i) == 1:                                                         
            counter += 1                                                        

    return counter                                                              


def main():                                                                     
    a = 13676                                                                   
    print bit_counter(a)                                                        
    a = -13676                                                                  
    print bit_counter(a)                                                        


if __name__ == "__main__":                                                      
    main()          

It works for negative and non negative numbers.

Khamidulla
  • 2,927
  • 6
  • 35
  • 59
1

The number of bits set to 1 in an integer number is a known as its Hamming weight or population count.

There are several fast algorithms to perform such a calculation. Have a look at the following sources:

  1. Bit array article on Wikipedia
  2. Bit Twiddling Hack
  3. Answers to Stackoverflow question #109023
  4. Bit Manipulation page on Python Wiki.
  5. The Aggregate Magic Algorithms (optimized for 32-bit integers).

My preferred generic algorithms is:

def bit_count(n):
    """Return the number of bits set to 1 in the integer number 'n'.
       This is called the Hamming weight or the population count of 'n'.
       The algorithm performs as many iterations as there are set bits.
       References:
         The C Programming Language 2nd Ed., Kernighan & Ritchie, 1988.
         Peter Wegner in CACM 3 (1960), 322.
    """
    assert n >= 0, 'Argument of bit_count() must be non-negative'
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count

For 32-bit non-negative integers, the following function is faster:

def bit_count_parallel_32(n):
    """Return the number of set bits (1) present in the integer number 'n'.
       This algorithm accepts only 32-bit non-negative integer numbers.
    """
    assert 0 <= n < 2**32, ('Argument of bit_count_parallel_32() must be '
        'non-negative and less than %d (2**32)') % 2**32
    n = n - ((n >> 1) & 0x55555555)
    n = (n & 0x33333333) + ((n >> 2) & 0x33333333)
    return (((n + (n >> 4) & 0x0F0F0F0F) * 0x01010101) & 0xffffffff) >> (8 + 16)

In case someone needs it, here is the version for 64-bit non-negative integers:

def bit_count_parallel_64(n):
    """Return the number of set bits (1) present in the integer number 'n'.
       This algorithm accepts only 64-bit non-negative integer numbers.
    """
    assert 0 <= n < 2**64, ('Argument of bit_count_parallel_64() must be '
        'non-negative and less than %d (2**64)') % 2**64
    n = n - ((n >> 1) & 0x5555555555555555)
    n = (n & 0x3333333333333333) + ((n >> 2) & 0x3333333333333333)
    return (((n + (n >> 4) & 0x0F0F0F0F0F0F0F0F) * 0x0101010101010101) \
        & 0xffffffffffffffff) >> (8 + 16 + 32)

Note however that all those algorithms do not work with negative integer numbers. If you need to use negative numbers, you can use the previous functions to count the numbers of bits set in the 2-complement representation of the negative number, e.g.:

>>> bit_count(-3 & 0xFF)
7

Have a look also to the site devoted to the great book Hacker's Delight (Wayback Machine).

novog
  • 121
  • 11
davidedb
  • 867
  • 5
  • 12
0

I think you're making this much harder than you need to. I don't understand what the bin((1 << 32) + int(num) & -5) is supposed to do. As you've noticed, bin(number) returns the binary representation of number. So now all you have to do is count the 1's. So this will work:

for num in numbers:
    data = bin(num)
    bitCount = 0
    for char in data:
        if char == '1':
            bitCount += 1
Oliver Dain
  • 9,617
  • 3
  • 35
  • 48
0

The simplest way is to use gmpy:

import gmpy
[gmpy.popcount(n & 0xFFFFFFFF) for n in numbers]
#>>> [8, 12, 10, 26, 25, 27, 8, 0, 8, 18, 20, 28]

This will also likely be quite fast compared to string operations.

Another method is to use Numpy:

import numpy

def count_bits_u32(u32):
  u32 = (u32 & 0x55555555) + ((u32 & 0xAAAAAAAA) >> 1)
  u32 = (u32 & 0x33333333) + ((u32 & 0xCCCCCCCC) >> 2)
  u32 = (u32 & 0x0F0F0F0F) + ((u32 & 0xF0F0F0F0) >> 4)
  u32 = (u32 & 0x00FF00FF) + ((u32 & 0xFF00FF00) >> 8)
  u32 = (u32 & 0x0000FFFF) + ((u32 & 0xFFFF0000) >> 16)
  return u32

count_bits_u32(numpy.array(numbers, dtype="uint32"))
#>>> array([ 8, 12, 10, 26, 25, 27,  8,  0,  8, 18, 20, 28], dtype=uint32)

This is likely to be more efficient with large lists of values, since it packs the integers and uses vectorized operations.

Veedrac
  • 58,273
  • 15
  • 112
  • 169