I was curious to see whether or not MSVC used the compiler intrinsic __popcnt for bitset::count
.
Looking around, I found this to be the implementation for std::bitset::count
for VS2017:
size_t count() const _NOEXCEPT
{ // count number of set bits
const char *const _Bitsperbyte =
"\0\1\1\2\1\2\2\3\1\2\2\3\2\3\3\4"
"\1\2\2\3\2\3\3\4\2\3\3\4\3\4\4\5"
"\1\2\2\3\2\3\3\4\2\3\3\4\3\4\4\5"
"\2\3\3\4\3\4\4\5\3\4\4\5\4\5\5\6"
"\1\2\2\3\2\3\3\4\2\3\3\4\3\4\4\5"
"\2\3\3\4\3\4\4\5\3\4\4\5\4\5\5\6"
"\2\3\3\4\3\4\4\5\3\4\4\5\4\5\5\6"
"\3\4\4\5\4\5\5\6\4\5\5\6\5\6\6\7"
"\1\2\2\3\2\3\3\4\2\3\3\4\3\4\4\5"
"\2\3\3\4\3\4\4\5\3\4\4\5\4\5\5\6"
"\2\3\3\4\3\4\4\5\3\4\4\5\4\5\5\6"
"\3\4\4\5\4\5\5\6\4\5\5\6\5\6\6\7"
"\2\3\3\4\3\4\4\5\3\4\4\5\4\5\5\6"
"\3\4\4\5\4\5\5\6\4\5\5\6\5\6\6\7"
"\3\4\4\5\4\5\5\6\4\5\5\6\5\6\6\7"
"\4\5\5\6\5\6\6\7\5\6\6\7\6\7\7\x8";
const unsigned char *_Ptr = &reinterpret_cast<const unsigned char&>(_Array);
const unsigned char *const _End = _Ptr + sizeof (_Array);
size_t _Val = 0;
for ( ; _Ptr != _End; ++_Ptr)
_Val += _Bitsperbyte[*_Ptr];
return (_Val);
}
It looks like its using lookup tables to get the number of bits for any given byte, and then counts the number of 1's for every byte.
According to this answer here, GCC implements it like this (along the lines of what I was thinking):
/// Returns the number of bits which are set.
size_t
count() const { return this->_M_do_count(); }
size_t
_M_do_count() const
{
size_t __result = 0;
for (size_t __i = 0; __i < _Nw; __i++)
__result += __builtin_popcountl(_M_w[__i]);
return __result;
}
Although I didn't benchmark anything, I would bet good money that GCC's implementation would be quite a bit faster here.
Hence, are there any compelling reason why MSVC implemented std::bitset::count
like this? My guess is either MSVC has a catch-all "no compiler intrinsics in STL" policy, or there's a difference between the two platforms that I'm overlooking.