2

What is an efficient way for working with large prime numbers with Python? You search on here or on google, and you find many different methods for doing so... sieves, primality test algorithms... Which ways work for larger primes?

bnlucas
  • 1,724
  • 1
  • 13
  • 18
  • How large are we talking? 10^6? 10^100? 10^1000? – Rushy Panchal Jun 25 '13 at 13:13
  • If you view the answer below, `10^301` worked just fine in under 3 seconds. It took 15.1s on my system to get the next prime higher than `10^500` using the methods below. Haven't gone higher. – bnlucas Jun 25 '13 at 13:17

3 Answers3

12

For determining if a number is a prime, there a sieves and primality tests.

# for large numbers, xrange will throw an error.
# OverflowError: Python int too large to convert to C long
# to get over this:

def mrange(start, stop, step):
    while start < stop:
        yield start
        start += step

# benchmarked on an old single-core system with 2GB RAM.

from math import sqrt

def is_prime(num):
    if num == 2:
        return True
    if (num < 2) or (num % 2 == 0):
        return False
    return all(num % i for i in mrange(3, int(sqrt(num)) + 1, 2))

# benchmark is_prime(100**10-1) using mrange
# 10000 calls, 53191 per second.
# 60006 function calls in 0.190 seconds.

This seems to be the fastest. There is another version using not any that you see,

def is_prime(num)
    # ...
    return not any(num % i == 0 for i in mrange(3, int(sqrt(num)) + 1, 2))

However, in the benchmarks I got 70006 function calls in 0.272 seconds. over the use of all 60006 function calls in 0.190 seconds. while testing if 100**10-1 was prime.

If you're needing to find the next highest prime, this method will not work for you. You need to go with a primality test, I have found the Miller-Rabin algorithm to be a good choice. It is a little slower the Fermat method, but more accurate against pseudoprimes. Using the above mentioned method takes +5 minutes on this system.

Miller-Rabin algorithm:

from random import randrange
def is_prime(n, k=10):
    if n == 2:
        return True
    if not n & 1:
        return False

    def check(a, s, d, n):
        x = pow(a, d, n)
        if x == 1:
            return True
        for i in xrange(s - 1):
            if x == n - 1:
                return True
            x = pow(x, 2, n)
        return x == n - 1

    s = 0
    d = n - 1

    while d % 2 == 0:
        d >>= 1
        s += 1

    for i in xrange(k):
        a = randrange(2, n - 1)
        if not check(a, s, d, n):
            return False
    return True

Fermat algoithm:

def is_prime(num):
    if num == 2:
        return True
    if not num & 1:
        return False
    return pow(2, num-1, num) == 1

To get the next highest prime:

def next_prime(num):
    if (not num & 1) and (num != 2):
        num += 1
    if is_prime(num):
        num += 2
    while True:
        if is_prime(num):
            break
        num += 2
    return num

print next_prime(100**10-1) # returns `100000000000000000039`

# benchmark next_prime(100**10-1) using Miller-Rabin algorithm.
1000 calls, 337 per second.
258669 function calls in 2.971 seconds

Using the Fermat test, we got a benchmark of 45006 function calls in 0.885 seconds., but you run a higher chance of pseudoprimes.

So, if just needing to check if a number is prime or not, the first method for is_prime works just fine. It is the fastest, if you use the mrange method with it.

Ideally, you would want to store the primes generated by next_prime and just read from that.

For example, using next_prime with the Miller-Rabin algorithm:

print next_prime(10^301)

# prints in 2.9s on the old single-core system, opposed to fermat's 2.8s
1000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000531

You wouldn't be able to do this with return all(num % i for i in mrange(3, int(sqrt(num)) + 1, 2)) in a timely fashion. I can't even do it on this old system.

And to be sure that next_prime(10^301) and Miller-Rabin yields a correct value, this was also tested using the Fermat and the Solovay-Strassen algorithms.

See: fermat.py, miller_rabin.py, and solovay_strassen.py on gist.github.com

Edit: Fixed a bug in next_prime

Pascal
  • 448
  • 3
  • 11
bnlucas
  • 1,724
  • 1
  • 13
  • 18
  • 2
    Instead of Miller-Rabin, you could use a Baillie-Wagstaff prime tester; it's more accurate (there are no known errors) and, depending on the number of witnesses you try, faster than Miller-Rabin, though it does involve more code. You could also use a wheel to cut the time required to compute the next prime by about half. Both options are shown at http://ideone.com/Qf9ZNG. – user448810 Jun 25 '13 at 17:23
  • Thanks for the input. I'll have to check it out! I resorted to going off pseudocode and things written in Java and C# because I found it some what hard to find an answer to this in Python from searching. – bnlucas Jun 25 '13 at 18:30
  • I was able to speed up your `isBaillieWagstaffPrime` from `10000 iterations: 2.28699994087, 4372 calls per second. 560006 function calls in 2.888 seconds` to `10000 iterations: 0.156999826431, 63694 calls per second. 40006 function calls in 0.234 seconds` benchmarking `100**10-1` You can get the code at [primes.py](https://gist.github.com/bnlucas/5876128) – bnlucas Jun 27 '13 at 13:02
  • 1
    I have two comments on your Baillie-Wagstaff tester. First, you never check for divisibility by 2. Second, I assume that your sqrt function is the normal floating-point built-in function. You may encounter floating point errors that cause squares to be considered non-square, and you may encounter a limit around 10^308 because floating point numbers can't be as large as arbitrary integers. – user448810 Jun 27 '13 at 14:37
  • I assume your speedup came from replacing the call to primes(limit) with a simpler trial division. Is that correct? If so, you could calculate the primes once and store them somewhere, which would be quicker than your method (25 primes less than 100, you test 50). – user448810 Jun 27 '13 at 14:54
  • Thanks for the suggestion on the issue with `sqrt(10**308)` I tested and `10**308` is working, but then you get the error at `10**309`. And the `for i in xrange(3, limit + 1, 2):` is a minor speedup, yes. And the ultimate goal is to store the values to make it even quicker. I have updated the code [here](https://gist.github.com/bnlucas/5876128) and you can see it in action [here](http://ideone.com/8VCKmB) which uses your `for i in primes(limit)` gave me a time of `3.49s` - `3.57s`, using `xrange(3, limit, 2)` gives around the same for `10**308`. – bnlucas Jun 27 '13 at 15:19
  • Then I'll have to look more closely to see how you got the speedup from 2.888 seconds to 0.234 seconds on 10000 iterations using 100**10-1. – user448810 Jun 27 '13 at 15:22
  • Running `10**309` I get an average of `5.85s` [here](http://ideone.com/9jo4zS) and `5.64s` [here](http://ideone.com/8VCKmB). The other speedups were replacing `n % 2 == 0` with `not n & 1` and `n / 2` with `n >> 1`, or `(n +1) >> 1` such as in `strong_lucas_pseudoprime`. The other optimization was replacing all `range` with `xrange`. – bnlucas Jun 27 '13 at 15:25
  • Your `isqrt` does take up some time over the native `math.sqrt` so what I did in `get_sqrt(n)` is if `n` < `10**308` it just uses `sqrt`, if it's over it uses your `isqrt` method. – bnlucas Jun 27 '13 at 15:36
  • Benchmarking it on Ideone has an execution timeout [here](http://ideone.com/3DhHtT) and gives `7.75s` [here](http://ideone.com/pJF3aL) – bnlucas Jun 27 '13 at 15:41
  • I would be worried about the floating point accuracy of math.sqrt. Say for instance that sqrt(25) returns 4.9999999999999999999. Then s^2 is not equal to n and 25 is reported as not a square. – user448810 Jun 27 '13 at 15:52
  • It's probably the change from range to xrange that was the primary speedup. – user448810 Jun 27 '13 at 15:53
  • `xrange` was the most significant. `n & 1` increased by `0.0080s`, `n >> 1` increased `0.0080s` and `randrange` over `randint` increased `0.0380s`. Whereas `xrange` was `3.3590s` faster. Also using `for i in xrange(s)` was faster over `while s > 0` in things like `strong_pseudoprime`. – bnlucas Jun 27 '13 at 16:01
  • I have fixed the issue on the floating point accuracy. See the answer below outlining the two methods and the benchmarks. – bnlucas Jun 27 '13 at 19:31
0

I have written two articles on this, along with benchmarks, to see what methods are faster.

Prime Numbers with Python was written before the knowledge of the Baillie-PSW primality test.

Prime Numbers with Python v2 was written afterwards, benchmarking the Lucas pseudoprimes and the Baillie-PSW primality test.

bnlucas
  • 1,724
  • 1
  • 13
  • 18
0

In response to the possible inaccuracy of math.sqrt I have benchmarked two different methods for performing an isqrt(n) call. isqrt_2(n) is coming from this article and this C code

The most common method seen:

def isqrt_1(n):
    x = n
    while True:
        y = (n // x + x) // 2
        if x <= y:
            return x
        x = y

cProfile.run('isqrt_2(10**308)')

Benchmark results:

isqrt_1 at 10000 iterations: 12.25
Can perform 816 calls per second.

         10006 function calls in 12.904 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   12.904   12.904 <string>:1(<module>)
        1    0.690    0.690   12.904   12.904 math.py:10(func)
    10000   12.213    0.001   12.213    0.001 math.py:24(isqrt_1)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}
        2    0.000    0.000    0.000    0.000 {time.time}

This method is incredibly slow. So we try the next method:

def isqrt_2(n):
    if n < 0:
        raise ValueError('Square root is not defined for negative numbers.')
    x = int(n)
   if x == 0:
        return 0
    a, b = divmod(x.bit_length(), 2)
    n = 2 ** (a + b)
    while True:
        y = (n + x // n) >> 1
        if y >= n:
            return n
        n = y

cProfile.run('isqrt_2(10**308)')

Benchmark results:

isqrt_2 at 10000 iterations: 0.391000032425
Can perform 25575 calls per second.

         30006 function calls in 1.059 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.059    1.059 <string>:1(<module>)
        1    0.687    0.687    1.059    1.059 math.py:10(func)
    10000    0.348    0.000    0.372    0.000 math.py:34(isqrt_2)
    10000    0.013    0.000    0.013    0.000 {divmod}
    10000    0.011    0.000    0.011    0.000 {method 'bit_length' of 'long' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}
        2    0.000    0.000    0.000    0.000 {time.time}

As you can see, the difference in isqrt_1(n) and isqrt_2(n) is an amazing 11.858999967575 seconds faster.

You can see this in action on Ideone.com or get the code

note: Ideone.com resulted in execution timeout for isqrt_1(n) so the benchmark was reduced to 10**200

bnlucas
  • 1,724
  • 1
  • 13
  • 18