5

Here's the implementation of std::bitset::count with MSVC 2010:

size_t count() const
    {   // count number of set bits
    static char _Bitsperhex[] = "\0\1\1\2\1\2\2\3\1\2\2\3\2\3\3\4";
    size_t _Val = 0;
    for (int _Wpos = _Words; 0 <= _Wpos; --_Wpos)
        for (_Ty _Wordval = _Array[_Wpos]; _Wordval != 0; _Wordval >>= 4)
            _Val += _Bitsperhex[_Wordval & 0xF];
    return (_Val);
    }

Can someone explain to me how this is working? what's the trick with _Bitsperhex?

Christian Rau
  • 45,360
  • 10
  • 108
  • 185
lezebulon
  • 7,607
  • 11
  • 42
  • 73
  • In general, tagging C and C++ is not a good idea unless the question specifically relates to both. This one looks like it's a C++ question (judging by its use of classes, specifically bitset, which does not exist in C) – Wug Sep 07 '12 at 19:22
  • 3
    @Wug : I tagged it C because the implementation is coded pretty much in C – lezebulon Sep 07 '12 at 19:23
  • From the asm output, looks like current MSVC uses an 8-bit version of this (256-entry table). There seems to be no way to tell it that the hardware `popcnt` instruction is available. (I tried `/Ox /arch:AVX`, which implies SSE4.2 support and thus also popcnt, but the asm output for `std::bitset<32> bs(x); return bs.count()` was still pretty crap: store to the stack and reload it 1 byte at a time, same as with just `/Ox`. https://godbolt.org/g/sGgoSD) gcc and clang use `popcnt` for `std::bitset::count()`. See https://stackoverflow.com/a/109069/224132 for more. – Peter Cordes Sep 22 '17 at 11:49

2 Answers2

9

_Bitsperhex contains the number of set bits in a hexadecimal digit, indexed by the digit.

digit: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
value: 0    1    1    2    1    2    2    3    1    2    2    3    2    3    3    4
index: 0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F

The function retrieves one digit at a time from the value it's working with by ANDing with 0xF (binary 1111), looks up the number of set bits in that digit, and sums them.

Wug
  • 12,956
  • 4
  • 34
  • 54
  • ok so basically the inner loop will loop as many times as there are 4 bits in a "word"? – lezebulon Sep 07 '12 at 19:25
  • 2
    @lezebulon, the inner loop is actually quite clever - it relies on right shifting to remove 4 bits at a time, and as soon as there are no bits left the loop exits. If the initial value is zero the loop doesn't even run. – Mark Ransom Sep 07 '12 at 19:27
4

_Bitsperhex is a 16 element integer array that maps a number in [0..15] range to the number of 1 bits in the binary representation of that number. For example, _Bitsperhex[3] is equal to 2, which is the number of 1 bits in the binary representation of 3.

The rest is easy: each multi-bit word in internal array _Array is interpreted as a sequence of 4-bit values. Each 4-bit value is fed through the above _Bitsperhex table to count the bits.

It is a slightly different implementation of the lookup table-based method described here: http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable. At the link they use a table of 256 elements and split 32-bit words into four 8-bit values.

AnT stands with Russia
  • 312,472
  • 42
  • 525
  • 765
  • The code shown doesn't have any notion of 8-bit words. It just keeps shifting the native words of the bitset (whatever those are, most probably 32/64-bit uints) by 4. So your description doesn't really explain what this code actually does, even if generally correct. – Christian Rau Sep 07 '12 at 19:31