Why is using all numbers to test for a prime number faster than using prime numbers only

Question

I made this program for generating prime numbers. I know there are lots of formulas for generating them 100x faster, but this is what I did.

I tried to divide i with all numbers under i. That was the simplest method, but I though it was inefficient, since after dividing by 2 you don't need to divide by 4 and so on.
I made a list of prime numbers smaller than i, and divided i by that list's numbers. I went through the list using std::iterator, because I saw it being used in all stackoverflow answers and other tutorials. It turned out to be a lot slower. Like it took 22 seconds instead of 2.
I tried to use an int to go through the list, and it took 2 seconds again.

Next, I used 1 000 000 to see the difference between method 1 and 3. To my amazement method 1 was faster. Why is that? Shouldn't using only prime numbers to test be faster than using all numbers?

#include <iostream>
#include <vector>
#include <chrono>

int main()
{
    std::cout << "how high do you want to generate prime numbers? ";
    int x;
    // typed 1 000 000
    std::cin >> x;
    auto starttime = std::chrono::high_resolution_clock::now();
    std::vector<unsigned int> primes;
    bool isPrime;
    for (int i = 2; i <= x; ++i) {
        isPrime = true;

        // takes 293 seconds
        //for (int div{ 2 }; div < i; ++div) {
            //  if ((i % div) == 0) {

        // takes really really long
        //for (std::vector<unsigned int>::iterator div = primes.begin(); div != primes.end(); ++div) {
            //if ((i % *div) == 0) {

        // takes 356 seconds
        for (int iter = 0; iter < primes.size(); ++iter) {
            if ((i % primes[iter]) == 0) {

                isPrime = false;
                break;
            }
        }
        if (isPrime) {
            primes.push_back(i);
            std::cout << i << " ";
        }
    }
    std::cout << "generating prime numbers up to " << x << " took " <<
        round(static_cast<std::chrono::duration<double>>((std::chrono::high_resolution_clock::now() - starttime)).count())
        << " seconds.";
}

My bet (I can't check at the moment) is that using a vector spends more time on memory access than is saved on integer modulo. Note you only need check divisors up to the square root of i, inclusive. This saves a lot of time. — Uri Raz, Nov 24 '19 at 17:09
Survey of techniques with performance timing: https://stackoverflow.com/a/5694432/576911 — Howard Hinnant, Nov 24 '19 at 17:32
In the for-loop, add another check before the end: `if (primes[iter] * primes[iter] > i) break;` — Eljay, Nov 24 '19 at 17:36
RAM is slow, compared to the CPU. It's probably faster for the CPU to crunch through the numbers brute force than to take a coffee break to wait for the RAM bus to arrive with its passenger. — Sam Varshavchik, Nov 24 '19 at 18:01

score 0 · Accepted Answer · answered Nov 24 '19 at 17:10

0

its because of usage vector<unsinged int> for third method. especially primes.push_back which leads to allocations. Try to primes.reserve initially

answered Nov 24 '19 at 17:10

Alex

1,047
8
21

How much? Could I make it dynamic, depending on x? – discape Nov 24 '19 at 17:20
if you don`t care about memory usage primes.reserve(x) is enough of course. otherwise for large numbers I would find some statistics for prime numbers – Alex Nov 24 '19 at 17:23
I did primes.reserve(x / log10(x / 1.08366)); – discape Nov 24 '19 at 18:31
also reserving slots took 30 sec – discape Nov 24 '19 at 19:36

score 0 · Answer 2 · answered Nov 24 '19 at 18:16

I'd say the main issue is that most frequently a number is divisible by 2 and thus it isn't prime. I suppose the first method is more friendly with compiler and cache. But it is hard to tell for sure. Also, try to remove printing and test for time spent. Printing tends to slow the code a lot depending on usage.

A standart method for determining all prime numbers (there are more efficient ones but this one is fairly simple).

Create vector A of boolean that will indicate whether the number is prime or not. But at start set all variables to true - except A[0]=A[1]=false.
Run for loop from i = 2 to x. If A[i] is false then skip it - i is not prime. If A[i] is true then i is prime and set all A[i*k] to false for all 1<k<x/i.

This should be more efficient either of the methods.

Why is using all numbers to test for a prime number faster than using prime numbers only

2 Answers2