
I need to generate a large number of prime numbers, but it is taking far too long using the Sieve of Eratosthenes. Currently it takes roughly 3 seconds to generate primes under 100,000 and roughly 30 seconds for primes under 1,000,000, which seems to indicate O(n) complexity, but as far as I know that's not right. Code:

def generate_primes(limit):
    boolean_list = [False] * 2 + [True] * (limit - 1)
    for n in range(2, int(limit ** 0.5 + 1)):
        if boolean_list[n] == True:
            for i in range(n ** 2, limit + 1, n):
                boolean_list[i] = False
    return boolean_list

Am I missing something obvious? How can I improve the performance of the sieve?

  • If your algorithm is O(n), congratulations, it's faster than any known implementation (it's usually O(n log log n) according to the Wikipedia page). But it's not clear why that's a problem. – Denziloe Aug 30 '18 at 20:09
  • `if boolean_list[n] == True:` => `if boolean_list[n]:`; apart from that, I don't see how it could be improved – Jean-François Fabre Aug 30 '18 at 20:09
  • maybe use `numpy` to set to false with vectorized indices to avoid the inner loop? – Jean-François Fabre Aug 30 '18 at 20:11
  • use an integer set instead of list? – Serge Aug 30 '18 at 20:11
  • @Denziloe Maybe I've misunderstood the time complexity for it. I was under the assumption that the Sieve of Eratosthenes was meant to run faster and I wasn't sure if there was a mistake with my implementation of it – danjedwards123 Aug 30 '18 at 20:12
  • @danjedwards123 Faster than what? – Denziloe Aug 30 '18 at 20:13
  • @Denziloe Just in general. Are there any quicker ways to generate large primes? – danjedwards123 Aug 30 '18 at 20:16
  • `n ** 2` => `n * n` – Serge Aug 30 '18 at 20:17
  • is it correct btw? `n ** 2` looks suspicious – Serge Aug 30 '18 at 20:19
  • @danjedwards123 Yes, I'm sure there are faster ways than Eratosthenes. Is your implementation of Eratosthenes worryingly slow? I don't see any reason to think so. Python is a high-level language and iterating in Python can often be around 100 times slower than using a lower-level language such as C. If you want to use Python you might want to look into `numpy` which provides an interface to fast arrays written in C. – Denziloe Aug 30 '18 at 20:21
  • @Serge They are the same in this case as ** is the power operator – danjedwards123 Aug 30 '18 at 20:22
  • @danjedwards123 I ran your code basically unchanged on a Raspberry Pi 1 Model B, no overclocking, and got 0.6 seconds and 6.65 seconds respectively for limits 10^5 and 10^6. Are you perhaps trying it out in debugging mode in an IDE? That could add delays on each iteration, and this code has lots of them. – Ilia Gilmijarow Aug 30 '18 at 20:39
  • use xrange for speed – Serge Aug 30 '18 at 20:47
  • tried n * n, 2ms improvement on my pc! – Serge Aug 30 '18 at 20:53
  • xrange has no advantage over range in python3, I think – Ilia Gilmijarow Aug 30 '18 at 20:56
  • Please see: [Why is “Is this correct?” an off topic question, and what should I ask instead?](https://meta.stackoverflow.com/questions/359466/why-is-is-this-correct-an-off-topic-question-and-what-should-i-ask-instead) – EJoshuaS - Stand with Ukraine Aug 30 '18 at 23:28

2 Answers


Indexing inside an explicit Python loop is well known to be an incredibly slow operation. By replacing the inner loop with array slicing, and the list with a NumPy array, we see a roughly 3x speed-up:

import numpy as np
import timeit

def generate_primes_original(limit):
    boolean_list = [False] * 2 + [True] * (limit - 1)
    for n in range(2, int(limit ** 0.5 + 1)):
        if boolean_list[n] == True:
            for i in range(n ** 2, limit + 1, n):
                boolean_list[i] = False
    return np.array(boolean_list, dtype=bool)

def generate_primes_fast(limit):
    boolean_list = np.array([False] * 2 + [True] * (limit - 1), dtype=bool)
    for n in range(2, int(limit ** 0.5 + 1)):
        if boolean_list[n]:
            # slice assignment marks every multiple of n in one vectorized step
            boolean_list[n * n:limit + 1:n] = False
    return boolean_list

limit = 1000
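# note: timeit.timeit repeats each statement 1,000,000 times by default,
# so the figures below are cumulative totals rather than per-call times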

print(timeit.timeit("generate_primes_fast(%d)"%limit, setup="from __main__ import generate_primes_fast"))
# 30.90620080102235 seconds

print(timeit.timeit("generate_primes_original(%d)"%limit, setup="from __main__ import generate_primes_original"))
# 91.12803511600941 seconds

assert np.array_equal(generate_primes_fast(limit),generate_primes_original(limit))
# [nothing to stdout - they are equal]

To gain even more speed, one option is to use numpy vectorization. Looking at the outer loop, it's not immediately obvious how one could vectorize that.

Second, you will see dramatic speed-ups if you port to Cython, which should be a fairly seamless process.
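
For reference, here is a rough sketch of what such a port could look like, written in Cython's pure-Python mode so the file still runs as ordinary Python (the name generate_primes_cython, the `cython` typing annotations, and building with `cythonize` are illustrative assumptions, not part of the original code):

import cython

def generate_primes_cython(limit: cython.long):
    # Typed counters let Cython compile the loop arithmetic down to C;
    # for the full benefit the list would usually be replaced with a
    # typed memoryview or NumPy array as well.
    n: cython.long
    i: cython.long
    boolean_list = [False] * 2 + [True] * (limit - 1)
    for n in range(2, int(limit ** 0.5 + 1)):
        if boolean_list[n]:
            for i in range(n * n, limit + 1, n):
                boolean_list[i] = False
    return boolean_list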

Edit: you may also see improvements by changing things like n**2 => math.pow(n,2), but minor improvements like that are inconsequential compared to the bigger problem, which is the iterator.

Daniel R. Livingston
  • `math.pow` is quite a bit slower than using the exponent operator, actually. https://stackoverflow.com/questions/20969773/exponentials-in-python-x-y-vs-math-powx-y – mypetlion Aug 30 '18 at 20:46
  • Thanks so much for this. I wasn't expecting such an improvement, and I'll look into the other two recommendations you gave. – danjedwards123 Aug 30 '18 at 20:55
  • @mypetlion - that's interesting, I didn't know that. It always surprises me how high the overhead for function calls is. Thanks! – Daniel R. Livingston Aug 30 '18 at 22:45
  • @danjedwards123 - good luck! If speed is critically important and you don't want to move to a compiled language, try checking out Julia - it's an insanely fast JIT language that is perfectly suited for this use case :) – Daniel R. Livingston Aug 30 '18 at 22:46

If you are still using Python 2, use `xrange` instead of `range` for greater speed.
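
Applied to the question's loops, that substitution would look roughly like this (Python 2 only; `xrange` was removed in Python 3):

# Python 2: xrange yields indices lazily instead of building a full list in memory
for n in xrange(2, int(limit ** 0.5 + 1)):
    if boolean_list[n]:
        for i in xrange(n * n, limit + 1, n):
            boolean_list[i] = False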

Serge