Erathostenes sieve optimization

Question

I've written the erathostenes algorithm in python a few weeks ago, and it looked like the following:

def erathostenes(n):

    A = range(2,n+1)
    B = []
    i = 0

    while A[i] < math.sqrt(n):
        B.append(A[i])
        j = i
        aux = A[i]
        while j < len(A):
            if A[j]%aux == 0:
                A[j] = 0
            j += aux
        i += 1
        while A[i] == 0:
            i +=  1
    for i in range(len(A)):
        if A[i] != 0:
            B.append(A[i])
        i += 1
    return B

After thinking a little (I'm noob in programming) i just did some modifications in my algorithm an right now looks like:

def erathostenes(n):

    A = range(2,n + 1)
    B = []
    i = 0

    raiz = math.sqrt(n)
    lenA = len(A)       
    rangeLenA = range(lenA)

    while A[i] < raiz:
        B.append(A[i])
        j = i
        aux = A[i]

        while j < lenA:
            A[j] = 0
            j += aux
        i += 1
        while A[i] == 0:
            i +=  1
    for i in rangeLenA:
        if A[i] != 0:
            B.append(A[i])
        i += 1
    return B

If I execute the algorithm with n=10.000.000 the execution time in the first code is approximately 7 sec and with the second code it's about 4 seconds.

Any ideas about some more optimizations in my algorithm? thanks!

There exist other algorithms, e.g. http://en.wikipedia.org/wiki/Sieve_of_Atkin — NPE, Oct 26 '13 at 15:08

jimifiki · Accepted Answer · 2013-10-26T15:54:10.640

4

i += 1

in the last loop is funny.

Consider replacing

for i in rangeLenA:

with

for i in xrange(LenA)

you avoid generating a huge list you don't need.

EDIT:

Also consider this:

    for j in xrange(i,lenA,aux):

instead of:

    while j < lenA:

And fix the bug

while A[i] <= raiz:

as suggested by fryday.

edited Oct 26 '13 at 15:54

answered Oct 26 '13 at 15:19

jimifiki

5,377
2
34
60

Let me link my own code http://stackoverflow.com/questions/9043791/python-eratosthenes-sieve-algorithm-optimization/9043972#9043972 :-) almost two times faster on my laptop. Use it like this: A = [i for i in primes_sieve2(10000000)] – jimifiki Oct 26 '13 at 15:30

score 2 · Answer 2 · answered Oct 26 '13 at 15:44

2

There is a error in your code. Change

while A[i] < raiz:

on

while A[i] <= raiz:

You can found error when N is square.

For opimization use xrange for rangeLenA instead of range.

answered Oct 26 '13 at 15:44

fryday

387
2
14

I didn't know xrage existed, thanks for the info. Is xrange a newer version of range, I supose? – Antoni Oct 26 '13 at 18:45
1

**range(n)** gets you list form 0 to n-1 e.g. range(5) = (0, 1, 2, 3, 4) **xrange(n)** gets you xrange object that yields values from list same to range(n) You can read about it in more detail here: http://docs.python.org/2/library/functions.html#xrange – fryday Oct 27 '13 at 16:37

Lucas Ribeiro · Answer 3 · 2013-10-26T18:50:05.813

Tried to make a non-loop version just for fun. It came out like this:

def erathostenes(n):

    def helper_function(num_lst, acc):

        if not num_lst:
            return acc
        if len(num_lst) == 1:
            acc.append(num_lst[0])
            return acc
        num = num_lst.pop(0)
        multiples = ([x for x in range(num + 1, num_lst[-1] + 1) 
                         if x % num == 0])

        remains = ([x for x in num_lst if x not in multiples])
        acc.append(num)
        return helper_function(remains, acc )
    return helper_function(range(2, n + 1), [])

When i ran the timing, got 826 us for the post erathostenes(1000), and 26ms for my version (!!). Surprised me it was so slow.

Functional programming it's more fun, but looks isn't the right fit for this problem, in Python (my guess is that it would be faster in a more functional language).

So i tried a imperative version. It looks like this:

def erathostenes_imperative(n):
    limit = int(math.sqrt(n))
    def helper_function(flags, size):
        for i in range(2,limit):
            if flags[i] == True:
                j = 2*i
                while j < size:
                    if j % i == 0:
                        flags[j] = False
                    j = j + i
        return [x for x in range(2, n + 1) if flags[x]]
    return helper_function([True]*(n + 1), n)

What i did was changing the list of integers into a list of True/False flags. Intuitively, looks like it's faster to iterate, right?

My results where 831ms for erathostenes_imperative(100000), vs. 1.45 in your version.

It's a shame that imperative writting it's faster. The code look so messy with all the fors, whiles, i's and j's

i'm not sure if your imperative version is totally correct. If you executed for n=10.000.000, the prime numbers should be 664579 and your version gives a result of 664580 (one more). Let's see if I can fix the problem. Thanks! — Antoni, Oct 26 '13 at 18:40
Lol. Just edited, edited back not to spoil. Feel free to edit my post after you find the bug! — Lucas Ribeiro, Oct 26 '13 at 18:51

score 1 · Answer 4 · answered Oct 26 '13 at 15:58

1

Try the Sieve of Atkin. It's similar, but it a modification of the Sieve of Eratosthenes, and it filters out all multiples of 2, 3, 5 right off, as well as a few other optimizations. You might also want to try to find a tool that tells you the run time of each operation and modify the operations with a larger run time.

However, since you're new to programming, you might best be served either implementing other algorithms, or doing other programming exercises to improve.

answered Oct 26 '13 at 15:58

jfa

1,047
3
13
39

It must be pointed out that one can apply wheel factorization to the Sieve of Atkin just as it's built in to the Sieve of Atkin, and in fact for the SoE can use higher levels of wheel factorazation whereas the SoA algorithm has the 2/3/5 wheel baked in. A SoE program with maximum wheel factoization written to the same level of complexity as a basic SoA will run faster than the SoA up to a range of something well over a billion, then will gradually get slower for the "one huge memory array" type of implementation, but this is for huge memory array sizes that may not be practical. – GordonBGood Apr 03 '15 at 06:03

Erathostenes sieve optimization

4 Answers4