Calculating Hamming Weight in O(1)

Question

In binary representation, hamming weight is the number of 1's. I came across web and found an O(1) answer to it:

v = v - ((v>>1) & 0x55555555);
v = (v & 0x33333333) + ((v>>2) & 0x33333333);
int count = ((v + (v>>4) & 0xF0F0F0F) * 0x1010101) >> 24;

However I don't quite understand the algorithm and cannot find a description of it anywhere. Can someone please explain it a little bit especially the last line (what the heck does *0x1010101 and then >> 24 mean)?

Side-note: It isn't actually `O(1)`. It's `O(log(n))` to the number of bits whereas a straightforward loop is `O(n)`. For a fixed integer size, both this and a straight-forward loop are `O(1)`. — Mysticial, Mar 05 '13 at 20:11
@Mysticial Yeah, If we consider a 32-bit int, they are both O(1). But this one should be faster than the iteration counting, right? — NSF, Mar 05 '13 at 21:27
@NSF - yes, it's likely faster than iteration counting (although it does have a multiplication, so...), but that's not what Big-Oh analysis is about. Big-Oh refers to how quickly does runtime grow as a function of the size of the input. In Big-Oh, the linear count is O(n), and this algorithm is O(log(n)), which is indeed better asymptotically, but sometimes it might take a particular value of n for an asymptotically faster algorithm to perform better. — Dylan McNamee, Sep 16 '19 at 17:13
I'm not sure this even counts as `O(log(n))`. It's `O(log(n))` *timewise*, yes, but it accomplishes this by requiring `O(n)` parallelization, in a way that adds up to `O(n)` total computing power (which will become obvious if you run this algorithm on an entire [large] file). — Brilliand, Jun 02 '20 at 19:39

score 32 · Accepted Answer · edited Sep 18 '15 at 20:43

This is part of a divide-and-conquer strategy for counting bits, called a "population" function. The scholarly treatment of this strategy can be found in Reingold and Nievergelt, 1977.

The idea is to first sum the bits pairwise, then 4-wise, then 8-wise and so on. For example, if you have the bits 1011, then the first pair 10 becomes 01 because there is one bit and the second becomes 10 because 10 = 2 in binary and there are two bits in 11. The essential fact here is that:

population(x) = x - (x/2) - (x/4) - (x/8) - (x/16) - ... etc.

The exact algorithm you have is a variant of what is known as the "HAKMEM" algorithm (see Beeler, Gosper and Schroppel, 1972). This algorithm counts 1's in 4-bit fields in parallel, then these sums are converted into 8-bit sums. The last step is an operation to add these 4 bytes by multiplying by 0x01010101. The 0x0F0F0F0F mask gets the 4-wise bytes sums by masking out non-sum information. For example, lets say the 8-wise field is 10110110, then there are 5 bits which is equal to 0101, thus we will have 10110101. Only the last four bits are significant, so we mask out the first four, ie:

10110101 & 0x0F = 00000101

You can find an entire chapter on the minutiae of counting bits in the book "Hacker's Delight" by Henry Warren.

Calculating Hamming Weight in O(1)

1 Answers1

Linked