2
POPCNT_MULT = 0x0000000000002000000000100000000008000000000400000000020000000001;
POPCNT_MASK = 0x0001041041041041041041041041041041041041041041041041041041041041;
POPCNT_MODULO = 0x3F;

number of bits turned on(1) in "num" = ((("num" * POPCNT_MULT) & POPCNT_MASK) % POPCNT_MODULO)

example :

num = 3 ( 3 = 0011 )

number of bits turned on(1) in "num" = (((3 * POPCNT_MULT) & POPCNT_MASK) % POPCNT_MODULO) = 2


hello! While studying algorithms, I found an algorithm with the above formula.

I want to study the above algorithm, but I don't know what the algorithm of the formula refers to...

It may be Hammingweight, but it may not be,,,

phuclv
  • 37,963
  • 15
  • 156
  • 475
imelon2
  • 123
  • 4

1 Answers1

3

The first number, POPCNT_MULT, serves to repeat the initial value, concatenated bitwise. Another way of viewing it:

0x0000000000002000000000100000000008000000000400000000020000000001 = 
  (0x2000000000000000000000000000000000000000000000000000
 + 0x100000000000000000000000000000000000000000
 + 0x8000000000000000000000000000000
 + 0x400000000000000000000
 + 0x20000000000
 + 0x1)

So the last part is just 1 times the value, 0x20000000000 is just a 1 followed by a bunch of zeros in binary, so the product is going to be the same just shifted left a bunch of bits, and so on for the remaining terms. There are enough padding zeros to ensure that the terms do no overlap while adding.

The next number, POPCNT_MASK, serves mask off all but every 6th bit of the product. To see why, take the magic repeated value 0x041, and write it in binary: 000001000001. The number of zeros of padding in the previous section was also chosen such that the masking preserves each bit of the individual value in exactly one of its repeated occurrences. In a way, we've effectively fanned out the value, interleaving spans of 5 consecutive zeros, and then permuted the order of the original bits.

Finally, we take the result modulo POPCNT_MODULO. This is where the real magic happens. Since all the bits of the original number are now spread out every 6 bits, they have effectively been multiplied by various powers of 2^6, or 64. If we take the bit at the ith position in the modified value (just after masking) to be b_i, then we can express the number as:

n = (b_0 * 64^0) + (b_1 * 64^1) + (b_2 * 64^2) + ...

If we take the result modulo 0x3F, or 63, we get our final expression:

                                               n = POPCNT mod 63
(b_0 * 64^0) + (b_1 * 64^1) + (b_2 * 64^2) + ... = POPCNT mod 63

We can simplify by reducing each coefficient mod 63: 64 = 1 mod 63:

(b_0 * 1^0) + (b_1 * 1^1) + (b_2 * 1^2) + ... = POPCNT mod 63
 b_0         + b_1         + b_2        + ... = POPCNT mod 63

Its then pretty clear why this result equals the POPCOUNT of the original value- its just summing the bits, taking the result mod 63. As long as the number of set bits is less than 63 (which its guaranteed to be), the result reduced modulo 63 equals the sum of bits outright.

Dillon Davis
  • 6,679
  • 2
  • 15
  • 37
  • 1
    Where was it guaranteed that the number of set bits is less than 63? – Kelly Bundy Feb 28 '23 at 03:35
  • 1
    @KellyBundy good point- I notice now that the original post nor my answer clarifies data types. This algorithm was designed for 32-bit integers- the zero padding during the multiplication stage only adds 40 bits of spacing after all. This means that at most 32 bits can be set – Dillon Davis Feb 28 '23 at 07:07
  • Yes, I think that would be good to state at the start. With that knowledge, I'll now try to fully read and understand it:-) – Kelly Bundy Feb 28 '23 at 11:50
  • Ok, I get it. Now the question is: Who is interested in 32-bit numbers but is able to do operations with 241-bit numbers? – Kelly Bundy Feb 28 '23 at 12:37
  • Thank you so much for sharing your great knowledge. It's been of such great help to me. – imelon2 Feb 28 '23 at 14:36