0

My teacher gave me an assignment like this: Using the number n given, find the largest prime number p with p<=n and n<=10^9. I tried doing this by using the following function:

Const amax=1000000000
Var i,j,n:longint;
    a:array [1..amax] of boolean;
Function lp(n:longint):longint;
 Var max:longint;
 Begin
  For i:=1 to n do a[i]:=true;
  For i:=2 to round(sqrt(n)) do
   If (a[i]=true) then
    For j:=1 to n div i do
     If (i*i+(j-1)*i<=n) then
      a[i*i+(j-1)*i]:=false;
  max:=0;
  i:=n;
  While max=0 do
   Begin
    If a[i]=true then max:=i;
    i:=i-1;
   End;
  lp:=max;
 End;

This code worked flawlessly for numbers such as 1 million, but when i tried n=10^9, the program took a long time to print the output. So here's my question: Are there any ways to improve my code for lower delay? Or maybe a different code?

jub0bs
  • 60,866
  • 25
  • 183
  • 186
Helscarthe
  • 11
  • 3

1 Answers1

2

The most important aspect here is that the greatest prime that is not greater than n must be fairly close to n. A quick look at The Gaps Between Primes (at The Prime Pages - always worth a look for everything to do with primes) shows that for 32-bit numbers the gaps between primes cannot be greater than 335. This means that the greatest prime not greater than n must be in the range [n - 335, n]. In other words, at most 336 candidates need to be checked - for example via trial division - and this is bound to be lots faster than sieving a billion numbers.

Trial division is a reasonable choice for tasks of this kind, because the range to be scanned is so small. In my answer to Prime sieve implementation (using trial division) in C++ I analysed a couple of ways for speeding it up.

The Sieve of Eratosthenes is also a good choice, it just needs to be modified to sieve only the range of interest instead of all numbers from 1 to n. This is called a 'windowed sieve' because it sieves only a window. Since the window will most likely not contain all the primes up to the square root of n (i.e. all the primes that could be potential least prime factors of composites in the range to be scanned) it is best to sieve the factor primes via a separate, simple Sieve of Eratosthenes.

First I'm showing a simple rendition of normal (non-windowed) sieve, as a baseline for comparing the windowed code to. I'm using C# in order to show the algorithm more clearly than would be possible with Pascal.

List<uint> small_primes_up_to (uint n)
{
    if (n == uint.MaxValue)
        throw new ArgumentOutOfRangeException("n", "n must be less than UINT32_MAX");

    var eliminated = new bool[n + 1];  // +1 because indexed by numbers

    eliminated[0] = true;
    eliminated[1] = true;

    for (uint i = 2, sqrt_n = (uint)Math.Sqrt(n); i <= sqrt_n; ++i)
        if (!eliminated[i])
            for (uint j = i * i; j <= n; j += i)
                eliminated[j] = true;

    return remaining_unmarked_numbers(eliminated, 0);
}

The fuction has 'small' in its name because it is not really suited for sieving big ranges; I use similar code (with a few bells and whistles) only for sieving the small factor primes needed by more advanced sieves.

The code for extracting the sieved primes is equally simple:

List<uint> remaining_unmarked_numbers (bool[] eliminated, uint sieve_base)
{
    var result = new List<uint>();

    for (uint i = 0, e = (uint)eliminated.Length; i < e; ++i)
        if (!eliminated[i])
            result.Add(sieve_base + i);

    return result;
}

Now, the windowed version. One difference is that the potential least factor primes need to be sieved separately (by the function just shown) as explained earlier. Another difference is that the starting point of the crossing-off sequence for a given prime may lie outside the range to be sieved. If the starting point lies before the start of the window then a bit of modulo magic is necessary to find the first 'hop' that lands in the window. From then on everything proceeds as usual.

List<uint> primes_between (uint m, uint n)
{
    m = Math.Max(m, 2);

    if (m > n)  
        return new List<uint>();  // empty range -> no primes

    // index overflow in the inner loop unless `(sieve_bits - 1) + stride <= UINT32_MAX`
    if (n - m > uint.MaxValue - 65521)  // highest prime not greater than sqrt(UINT32_MAX)
        throw new ArgumentOutOfRangeException("n", "(n - m) must be <= UINT32_MAX - 65521");

    uint sieve_bits = n - m + 1;
    var eliminated = new bool[sieve_bits];

    foreach (uint prime in small_primes_up_to((uint)Math.Sqrt(n)))
    {
        uint start = prime * prime, stride = prime;

        if (start >= m)
            start -= m;
        else
            start = (stride - 1) - (m - start - 1) % stride;

        for (uint j = start; j < sieve_bits; j += stride)
            eliminated[j] = true;
    }

    return remaining_unmarked_numbers(eliminated, m);
}

The two '-1' terms in the modulo calculation may seem strange, but they bias the logic down by 1 to eliminate the inconvenient case stride - foo % stride == stride that would need to be mapped to 0.

With this, the greatest prime not exceeding n could be computed like this:

uint greatest_prime_not_exceeding (uint n)
{
    return primes_between(n - Math.Min(n, 335), n).Last();
}

This takes less than a millisecond all told, including the sieving of the factor primes and so on, even though the code contains no optimisations whatsoever. A good overview of applicable optimisations can be found in my answer to prime number summing still slow after using sieve; with the techniques shown there the whole range up to 10^9 can be sieved in about half a second.

Community
  • 1
  • 1
DarthGizka
  • 4,347
  • 1
  • 24
  • 36