A way to find the nearest prime number to an unsigned long integer ( 32 bits wide ) in C?

Question

I'm looking for a way to find the closest prime number. Greater or less than, it doesn't matter, simply the closest ( without overflowing, preferably. ) As for speed, if it can compute it in approximately 50 milliseconds on a 1GHz machine ( in software, running inside Linux ), I'd be ecstatic.

How about having an array of all integer-range prime numbers? — MByD, Mar 08 '12 at 15:32
Well, depending on the number of primes from 0x0 to 0xFFFFFFFF, I guess that would be a most suitable way to do it. — Erkling, Mar 08 '12 at 15:33
Here is an algorithm for finding prime numbers, it builds from 2 up, http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes. — twain249, Mar 08 '12 at 15:34
@Binyamin: doable, but that's a pretty big table (about 200 million entries, I think). — Michael Burr, Mar 08 '12 at 15:35
There are [203280221 primes below 2^32](http://www.wolframalpha.com/input/?i=prime+numbers+less+than+2%5E32). Such a table would take up about 800MB. Is that too much? — R. Martinho Fernandes, Mar 08 '12 at 15:37
@Erkling Right, so you start at your specific number, you do a primality test on that number, incrementing it until you find one, and do the same by decrementing it, and pick the one of the possibly 2 that's closest — nos, Mar 08 '12 at 15:42
Consider it WAY too much, it's intended to be fairly small, since the machine this will be running on has 256MB of RAM. @nos I see, I guess that would be relatively suitable. — Erkling, Mar 08 '12 at 15:43

Brett Hale · Answer 1 · 2012-03-09T05:14:56.040

The largest prime gap in the range up to (2^32 - 1) is (335). There are (6542) primes less than (2^16) that can be tabulated and used to sieve successive odd values after a one-time setup. Obviously, only primes <= floor(sqrt(candidate)) need be tested for a particular candidate value.

Alternatively: The deterministic variant of the Miller-Rabin test, with SPRP bases: {2, 7, 61} is sufficient to prove primality for a 32-bit value. Due to the test's complexity (requires exponentiation, etc), I doubt it would be as fast for such small candidates.

Edit: Actually, if multiply/reduce can be kept to 32-bits in exponentiation (might need 64-bit support), the M-R test might be better. The prime gaps will typically be much smaller, making the sieve setup costs excessive. Without large lookup tables, etc., you might also get a boost from better cache locality.

Furthermore: The product of primes {2, 3, 5, 7, 11, 13, 17, 19, 23} = (223092870). Explicitly test any candidate in [2, 23]. Calculate greatest common divisor: g = gcd(u, 223092870UL). If (g != 1), the candidate is composite. If (g == 1 && u < (29 * 29)), the candidate (u > 23) is definitely prime. Otherwise, move on to the more expensive tests. A single gcd test using 32-bit arithmetic is very cheap, and according to Mertens' (?) theorem, this will detect ~ 68.4% of all odd composite numbers.

This style can be continued with the next 5 primes `{29, 31, 37, 41, 43}` whose product is `58642669`. etc. etc. But the number of primes in each set shrinks, as eventually the product will exceed 32-bits. — Jesse Chisholm, Dec 26 '20 at 01:45

Sumudu Fernando · Accepted Answer · 2012-03-09T05:45:59.340

UPDATE 2: Fixed (in a heavy-handed way) some bugs that caused wrong answers for small n. Thanks to Brett Hale for noticing! Also added some asserts to document some assumptions.

UPDATE: I coded this up and it seems plenty fast enough for your requirements (solved 1000 random instances from [2^29, 2^32-1] in <100ms, on a 2.2GHz machine -- not a rigorous test but convincing nonetheless).

It is written in C++ since that's what my sieve code (which I adapted from) was in, but the conversion to C should be straightforward. The memory usage is also (relatively) small which you can see by inspection.

You can see that because of the way the function is called, the number returned is the nearest prime that fits in 32 bits, but in fact this is the same thing since the primes around 2^32 are 4294967291 and 4294967311.

I tried to make sure there wouldn't be any bugs due to integer overflow (since we're dealing with numbers right up to UINT_MAX); hopefully I didn't make a mistake there. The code could be simplified if you wanted to use 64-bit types (or you knew your numbers would be smaller than 2^32-256) since you wouldn't have to worry about wrapping around in the loop conditions. Also this idea scales for bigger numbers as long as you're willing to compute/store the small primes up to the needed limit.

I should note also that the small-prime-sieve runs quite quickly for these numbers (4-5 ms from a rough measurement) so if you are especially memory-starved, running it every time instead of storing the small primes is doable (you'd probably want to make the mark[] arrays more space efficient in this case)

#include <iostream>
#include <cmath>
#include <climits>
#include <cassert>

using namespace std;

typedef unsigned int UI;

const UI MAX_SM_PRIME = 1 << 16;
const UI MAX_N_SM_PRIMES = 7000;
const UI WINDOW = 256;

void getSMPrimes(UI primes[]) {
  UI pos = 0;
  primes[pos++] = 2;

  bool mark[MAX_SM_PRIME / 2] = {false};
  UI V_SM_LIM = UI(sqrt(MAX_SM_PRIME / 2));
  for (UI i = 0, p = 3; i < MAX_SM_PRIME / 2; ++i, p += 2)
    if (!mark[i]) {
      primes[pos++] = p;
      if (i < V_SM_LIM)
        for (UI j = p*i + p + i; j < MAX_SM_PRIME/2; j += p)
          mark[j] = true;
      }
  }

UI primeNear(UI n, UI min, UI max, const UI primes[]) {
  bool mark[2*WINDOW + 1] = {false};

  if (min == 0) mark[0] = true;
  if (min <= 1) mark[1-min] = true;

  assert(min <= n);
  assert(n <= max);
  assert(max-min <= 2*WINDOW);

  UI maxP = UI(sqrt(max));
  for (int i = 0; primes[i] <= maxP; ++i) {
    UI p = primes[i], k = min / p;
    if (k < p) k = p;
    UI mult = p*k;
    if (min <= mult)
      mark[mult-min] = true;
    while (mult <= max-p) {
      mult += p;
      mark[mult-min] = true;
      }
    }

  for (UI s = 0; (s <= n-min) || (s <= max-n); ++s)
    if ((s <= n-min) && !mark[n-s-min])
      return n-s;
    else if ((s <= max-n) && !mark[n+s-min])
      return n+s;

  return 0;
  }

int main() {
  UI primes[MAX_N_SM_PRIMES];
  getSMPrimes(primes);

  UI n;
  while (cin >> n) {
    UI win_min = (n >= WINDOW) ? (n-WINDOW) : 0;
    UI win_max = (n <= UINT_MAX-WINDOW) ? (n+WINDOW) : UINT_MAX;

    if (!win_min)
      win_max = 2*WINDOW;
    else if (win_max == UINT_MAX)
      win_min = win_max-2*WINDOW;

    UI p = primeNear(n, win_min, win_max, primes);
    cout << "found nearby prime " << p << " from window " << win_min << ' ' << win_max << '\n';
    }
  }

You can sieve intervals in that range if you know primes up to 2^16 (there are only 6542 <= 2^16; you should go a bit higher if the prime itself could be greater than 2^32 - 1). Not necessarily the fastest way but very simple, and fancier prime testing techniques are really suited to much larger ranges.

Basically, do a regular Sieve of Eratosthenes to get the "small" primes (say the first 7000). Obviously you only need to do this once at the start of the program, but it should be very fast.

Then, supposing your "target" number is 'a', consider the interval [a-n/2, a+n/2) for some value of n. Probably n = 128 is a reasonable place to start; you may need to try adjacent intervals if the numbers in the first one are all composite.

For every "small" prime p, cross out its multiples in the range, using division to find where to start. One optimization is that you only need to start crossing off multiples starting at p*p (which means that you can stop considering primes once p*p is above the interval).

Most of the primes except the first few will have either one or zero multiples inside the interval; to take advantage of this you can pre-ignore multiples of the first few primes. The simplest thing is to ignore all even numbers, but it's not uncommon to ignore multiples of 2, 3, and 5; this leaves integers congruent to 1, 7, 11, 13, 17, 19, 23, and 29 mod 30 (there are eight, which map nicely to the bits of a byte when sieving a large range).

...Sort of went off on a tangent there; anyway once you've processed all the small primes (up till p*p > a+n/2) you just look in the interval for numbers you didn't cross out; since you want the closest to a start looking there and search outward in both directions.

If Brett Hale is correct about the largest gap then your `n` should be 335 or maybe a couple larger. — Mark Ransom, Mar 08 '12 at 16:34
Also I'd precompute the primes below 2^16 into a static table and use a binary search when `a` is small enough. — Mark Ransom, Mar 08 '12 at 16:36
The binary search is a good idea (I didn't say "static table" but that is indeed what I meant). I don't know how common the largest gap is so it might be better in the average case to use an n smaller than 335 then test adjacent intervals if needed (though quite possibly the difference between n = 128 and n = 512 would be negligible; I used this type of construction to build a general segmented sieve and found intervals of size ~20000 to be quite performant) — Sumudu Fernando, Mar 09 '12 at 02:11
I suspect the sieve will be the slowest part of this process, it would be a shame to need to do it twice on two different intervals. Better to be sure the first time. — Mark Ransom, Mar 09 '12 at 02:22
The input '3' tells me the nearest prime is '1', and '13' gives '23'. — Brett Hale, Mar 09 '12 at 05:27
Doh...I made some assumptions that are only true for big numbers. Will update with a fix in a sec. Thanks for checking! — Sumudu Fernando, Mar 09 '12 at 05:39
If the input is greater than 2^32-256 you could handle it with a few `if` statements. — Mark Ransom, Mar 09 '12 at 14:21
...I do handle it with a few if statements (all I meant in the description was that these tests could be removed if we didn't have to worry about those cases); this code should work for any input that fits in 32-bit unsigned int assuming I didn't make any other mistakes like what Brett Hale caught. — Sumudu Fernando, Mar 09 '12 at 21:47

A way to find the nearest prime number to an unsigned long integer ( 32 bits wide ) in C?

2 Answers2

Linked