1

What is the fastest way to count all set bits?
The number from which I am counting bits is from 1<=n<2^63, only positive values.

For now I use bitset from the library and it is pretty fast, but if there is any faster option I would like to know.

In the worst case scenario I have over 1 billion iterations, so I am looking a way to speed it up.
This is the part of my loop where I am counting the setted bits:

if (std::bitset<64>(currentNumber).count() == numofOnes)
++counter;
Yunnosch
  • 26,130
  • 9
  • 42
  • 54
MrRobot
  • 43
  • 1
  • 8
  • 4
    Maybe [`std::popcount()`](https://en.cppreference.com/w/cpp/numeric/popcount)? (Experiments will be needed) – MikeCAT Dec 21 '20 at 23:00
  • 2
    If you don't have `std::popcount()` from C++20, see https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer/109025#109025 – Nate Eldredge Dec 21 '20 at 23:02
  • 2
    Does this answer your question? [How to count the number of set bits in a 32-bit integer?](https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer) - there's a 64-bit answer provided there, even though the title says 32 bits. – Ken Y-N Dec 21 '20 at 23:10
  • 3
    To be fair that question is getting dated. Nowadays all compilers support popcnt intrinsics in a way or another, the ugly bithack that is the first answer is no longer useful and might even be slower. — so if you do follow that link, scroll past that one. – spectras Dec 21 '20 at 23:13
  • 4
    I am not a fan of "fastest" questions. In order to illustrate why and that the question could be improved, let me put this answer for discussion: The fastest way is either a lookup table (not for the full range but hierarchically balanced) or a hardware-supported bit counting engine. Now please explain why these two options are not an answer for you. – Yunnosch Dec 21 '20 at 23:21
  • 1
    I quite like the idea of a look-up table. Hard to beat it for speed. Take up a wee bit of storage though. – user4581301 Dec 21 '20 at 23:25
  • The __builtin_popcount is slower option than bitset https://stackoverflow.com/questions/60165922/stdbitsetncount-vs-builtin-popcount I will have to use lookup table, thanks. – MrRobot Dec 21 '20 at 23:28
  • Lookup table solved my question, thanks for the advice. – MrRobot Dec 21 '20 at 23:36
  • @user4581301 Size of lookup table is scalable, or can be "balanced". – Yunnosch Dec 21 '20 at 23:59
  • @Yunnosch Yup but they're slower than the full lookup table. Asker wanted fastest, not fastest with a reasonable implementation. – user4581301 Dec 22 '20 at 00:00
  • @MrRobot: If you see the comments on the answer you linked, the "slower" result is likely an artifact of the test, since the actual assembly code is the same for both. – Nate Eldredge Dec 22 '20 at 00:01
  • Hamming Weight is not much storage extensive also works pretty fast. – MrRobot Dec 22 '20 at 00:03
  • 1
    @user4581301 Actually I wnated to provoke by mentioning the full size lookup table (obviously not an answer) and then could not resist to wisenheimer about the possibility to scale it and balance speed vs. size. Ah, well, turned out a decent answer. – Yunnosch Dec 22 '20 at 00:03

1 Answers1

2

Here is a demo of what I mean with a balanced lookup table.
This one is heavy on the size saving.
You can adapt the concept to lookup tables of size e.g. 64 or 256, speeding up considerably.

#include <iostream>

using namespace std;

int main()
{
   unsigned long long int input=679043ULL; // just a big number, for demo
   
   unsigned char lookup[16]={0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4};
   unsigned char result=0;

   while (input>0)
   {
       cout << (unsigned int) result << " " << (input&15) << " " << input << endl;
       result+=lookup[input&15];
       input>>=4;
   }
   cout << (unsigned int)result << endl;
   return 0;
}

Output:

0 3 679043
2 8 42440
3 12 2652
5 5 165
7 10 10
9

The demo outputs shows that the loop accumulates
2 bits in "3",
1 bit in "8",
2 bits in "12",
2 bits in "5",
two bits in "10";
for a total of 9.

user4581301
  • 33,082
  • 7
  • 33
  • 54
Yunnosch
  • 26,130
  • 9
  • 42
  • 54