I would like to ask a question that can be summarized in one sentence:
What could be explanation for the performance difference between using
vector<bool>
(slow) and usingvector<uint8_t>
(fast)? (compiled with g++ 8.2 -O1 with C++17 standard)
Here is the full story.
It starts with LeetCode's Count Primes problem. Count the number of primes less than n.
One solution is to apply the Sieve of Eratosthenes, whose C++ code I will list below.
However, I interestedly noticed that with only one difference -- the vector
type, there is a consistent gap in performance. Namely,
| type | runtime | memory |
|-----------------|---------|---------|
| vector<bool> | 68 ms | 8.6 MB |
| vector<uint8_t> | 24 ms | 11.5 MB |
The full code is
class Solution
{
public:
int countPrimes(int n)
{
// vector<uint8_t> isprime(n, true);
vector<bool> isprime(n, true);
int res = 0, root = std::sqrt(n);
for (int i = 2; i < n; ++i) {
if (isprime[i]) {
++res;
if (i <= root)
for (int k = i * i; k < n; k += i)
isprime[k] = false;
}
}
return res;
}
};
For your reference, LeetCode has the following note for C++ compilation
Compiled with g++ 8.2 using the latest C++ 17 standard.
Your code is compiled with level one optimization (-O1). AddressSanitizer is also enabled to help detect out-of-bounds and use-after-free bugs.
Most standard library headers are already included automatically for your convenience.
My question is: what could be the explanation for the observation that using vector<bool>
is much slower (68 ms, 60% percentile) than using vector<uint8_t>
(24 ms, 95% percentile)?
I repeated the experiment for more than 10 times. The difference in memory usage is also consistent. If you have any explanation for that (vector<uint8_t>
uses more memory than vector<bool>
), I also would like to hear.
Thank you very much!