Time complexity finding n primes with trial division by all preceding primes

Question

Problem : Finding n prime numbers.

#include<stdio.h>
#include<stdlib.h>

void firstnprimes(int *a, int n){

    if (n < 1){
        printf("INVALID");
        return;
    }

    int i = 0, j, k;                // i is the primes counter

    for (j = 2; i != n; j++){       // j is a candidate number

        for (k = 0; k < i; k++)
        {
            if (j % a[k] == 0)      // a[k] is k-th prime
                break;
        }

        if (k == i)                 // end-of-loop was reached
            a[i++] = j;             // record the i-th prime, j
    }
    return;
}

int main(){
    int n;
    scanf_s("%d",&n);
    int *a = (int *)malloc(n*sizeof(int));
    firstnprimes(a,n);
    for (int i = 0; i < n; i++)
        printf("%d\n",a[i]);
    system("pause");
    return 0;
}

My function's inner loop runs for i times (at the most), where i is the number of prime numbers below a given candidate number, and the outer loop runs for (nth prime number - 2) times.

How can I derive the complexity of this algorithm in Big O notation?

Thanks in advance.

Aside: you don't need to check *all* the previous primes as factors - only those <= square root of candidate. Such as the limit `a[k] * a[k] <= j;` Moreover there is no need to check any even number. — Weather Vane, Jun 21 '17 at 19:40
... with the exception of `2` which is the only even prime. So having initialised the array with `2` the outer loop can be `for (j = 3; i != n; j+=2)` — Weather Vane, Jun 21 '17 at 19:46

score 1 · Answer 1 · answered Jun 21 '17 at 19:39

The prime number theorem states that asymptotically, the number of primes less than n is equal to n/log n. Therefore, your inner loop will run Theta of i * max =n / log n * n times (assuming max=n).

Also, your outer loop runs on the order of n log n times, making the total complexity Theta of n / log n * n * n log n = n^3. In other words, this is not the most efficient algorithm.

Note that there are better approximations around (e.g. the n-th prime number is closer to:

n log n + n log log n - n + n log log n / log n + ...

But, since you are concerned with just big O, this approximation is good enough.

Also, there are much better algorithms for doing what you're looking to do. Look up the topic of pseudoprimes, for more information.

I mean that my inner loop runs for a maximum of i times. – Sai Shankar Jun 22 '17 at 01:51 — Sai Shankar, Jun 22 '17 at 01:51

Will Ness · Accepted Answer · 2017-06-25T22:10:07.220

In pseudocode your code is

firstnprimes(n) = a[:n]   # array a's first n entries
    where
       i = 0
       a = [j for j in [2..]  
                if is_empty( [j for p in a[:i] if (j%p == 0)] )
                   && (++i) ]

(assuming the short-circuiting is_empty which returns false as soon as the list is discovered to be non-empty).

What it does is testing each candidate number from 2 and up by all its preceding primes.

Melissa O'Neill analyzes this algorithm in her widely known JFP article and derives its complexity as O( n^2 ).

Basically, each of the n primes that are produced is paired up with (is tested by) all the primes preceding it (i.e. k-1 primes, for the k th prime) and the sum of the arithmetic progression 0...(n-1) is (n-1)n/2 which is O( n^2 ); and she shows that composites do not contribute any term which is more significant than that to the overall sum, as there are O(n log n) composites on the way to n th prime but the is_empty calculation fails early for them.

Here's how it goes: with m = n log n, there will be m/2 evens, for each of which the is_empty calculation takes just 1 step; m/3 multiples of 3 with 2 steps; m/5 with 3 steps; etc.

So the total contribution of the composites, overestimated by not dealing with the multiplicities (basically, counting 15 twice, as a multiple of both 3 and 5, etc.), is:

SUM{i = 1, ..., n} (i m / p_i)          // p_i is the i-th prime
= m SUM{i = 1, ..., n} (i / p_i)
= n log(n) SUM{i = 1, ..., n} (i / p_i)
< n log(n) (n / log(n))                 // for n > 14,000 
= n^2

The inequality can be tested at Wolfram Alpha cloud sandbox as Sum[ i/Prime[i], {i, 14000}] Log[14000.0] / 14000.0 (which is 0.99921, and diminishing for bigger n, tested up to n = 2,000,000 where it's 0.963554).

in what language? C? -- simplify, or optimize to improve its speed? the latter will need *more complex* code (to add one or two new tests). or the algorithm may be completely changed, specifically from your trial division, to the [tag:sieve-of-eratosthenes]. there's lots and lots of Q&A on that in [tag:primes] on SO. — Will Ness, Jun 25 '17 at 22:07

Time complexity finding n primes with trial division by all preceding primes

2 Answers2

Linked