0

I wrote code in Python 2.7 for creating a list of prime numbers. The code is

def primes_list(num):
    ans = [2]
    for i in range(3, num, 2):
        for j in ans:
            if i % j == 0:
                break
        else:
            ans.append(i)
    else:
        return ans

Is this more efficient than the Sieve of Eratosthenes or not? I think its memory efficiency should be better, but I have doubts about its time efficiency. How do I calculate the time and memory efficiency, and how do I benchmark them?

Will Ness
Aseem Bansal

4 Answers

4

What you are doing is trial division - testing each candidate prime against each known prime below it. Skipping even numbers in the call to range will save you some divisions, but this is exactly the technique sieving is based on: you know that every second number is divisible by 2, and therefore composite. The sieve just extends this to:

  • every third number is divisible by three, and therefore composite;
  • every fifth by five, and therefore composite;

and so on. Since the Sieve is regarded as one of the most time-efficient algorithms available, and trial division as one of the least (Wikipedia describes the sieve as O(n log log n), and per the comments below, your algorithm is likely O(n^2 / log n)), it is reasonable to assume that trial division, even with some Sieve-like optimisation, falls short of sieving.

lvc
  • Those time complexities can't possibly be right. Checking the primality of a single number using trial division is indeed `O(sqrt n)` (if you only try numbers up to sqrt(n), which isn't happening here), but finding all primes up to `n` can't possibly be sub-linear. – sepp2k May 22 '13 at 11:46
  • @sepp2k Probably lvc meant `O(sqrt(n))` for every prime, which should lead to `O(n sqrt(n))` in total; the same is true for the sieve. It's actually an `O(n log(log(n)))` algorithm. – Bakuriu May 22 '13 at 11:47
  • @Bakuriu That might be what he meant, but the OP's code isn't `O(n sqrt(n))` either. It's `O(n * number_of_primes_below(n))` (and likely Theta of that as well), which would be `O(n^2 / log n)` according to what you said in your answer about the number of primes below n being around `n / log n`. – sepp2k May 22 '13 at 11:51
  • @sepp2k indeed. The missing `n` was a typo, but the overall complexity of trial division is based only on what Wikipedia says. Its logic is hard for me to follow, and yours makes more sense. – lvc May 22 '13 at 11:54
  • Thanks for the comments and explanation. This will help me. – Aseem Bansal May 22 '13 at 12:03
  • @lvc it's actually `O(n^2 / log(n)^2)`. Details are in my answer. – Will Ness May 24 '13 at 11:40
3

No, that's trial division, which is much worse than the sieve of Eratosthenes in time complexity. Its space complexity is a bit better, but, since there are about n/log(n) primes below n, you are not saving huge quantities of memory. Also, the sieve can be done using bit vectors, which reduce the constants by a factor of 32/64 (and thus for practical purposes it might be even better).
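
The `erat` function timed below is not shown in this answer; per the comments, it is a plain-Python version of the numpy `erat2` given further down. A rough equivalent (a sketch under that assumption, not necessarily the exact code that produced these timings) would be:

def erat(n):
    # plain-Python Sieve of Eratosthenes: all primes below n (assumes n >= 2)
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for p in xrange(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # every multiple of p from p*p onwards is composite
            for m in xrange(p * p, n, p):
                is_prime[m] = False
    return [i for i, flag in enumerate(is_prime) if flag]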


A small benchmark that shows the difference in timings:

>>> timeit.timeit('primes_list(1000)', 'from __main__ import primes_list', number=1000)
0.901777982711792
>>> timeit.timeit('erat(1000)', 'from __main__ import erat', number=1000)
0.2097640037536621

As you can see, even with n=1000 the sieve of Eratosthenes is more than 4 times faster. If we increase the search up to 10000:

>>> timeit.timeit('primes_list(10000)', 'from __main__ import primes_list', number=1000)
50.41101098060608
>>> timeit.timeit('erat(10000)', 'from __main__ import erat', number=1000)
2.3083159923553467

Now the sieve of Eratosthenes is 21 times faster; as you can see, it is clearly much faster than trial division.


Using numpy arrays it's quite easy to reduce the memory by a factor of 32 or 64 (depending on your machine architecture) and obtain much faster results:

>>> import numpy as np
>>> def erat2(n):
...     ar = np.ones(n, dtype=bool)
...     ar[0] = ar[1] = False
...     ar[4::2] = False
...     for j in xrange(3, n, 2):
...             if ar[j]:
...                     ar[j**2::2*j] = False
...     return ar.nonzero()[0]
... 
>>> timeit.timeit('erat2(10000)', 'from __main__ import erat2', number=1000)
0.5136890411376953

Another factor of 4 faster than the plain-Python sieve.
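
As a quick sanity check (not part of the timings above), `erat2(30)` should give the primes below 30; output formatting may vary slightly by platform:

>>> erat2(30)
array([ 2,  3,  5,  7, 11, 13, 17, 19, 23, 29])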

Bakuriu
  • My memory concerns were about the flags. They have to be of the order n. I was trying to use bitvectors but they are .tar files. How do I use them? Totally new to Python so don't know how to import custom modules in my script. Any ideas regarding comparing the efficiencies of 2 different python scripts? – Aseem Bansal May 22 '13 at 11:49
  • @Zel Memory `O(n/log(n))` or `O(n)` doesn't change much. Also you can use an [incremental sieve](http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#Incremental_sieve) to avoid storing all numbers up to `n` at the same time. – Bakuriu May 22 '13 at 11:57
  • I looked up timeit.timeit on http://docs.python.org/2/library/timeit.html The last example is similar to what you have used. What is the number argument that you have used at the end of your examples? Also, did you define erat just now? – Aseem Bansal May 22 '13 at 12:14
  • @Zel If you are interested in timing algorithms you should learn to use it, and also check the `profile` module. You may also be interested in [this](http://stackoverflow.com/questions/110259/which-python-memory-profiler-is-recommended) question about memory profiling. I ran the code as you see it here (performance may change depending on how your computer handles vectorized operations etc.). The `erat` function is a pure python version of `erat2`, using explicit `for` loops instead of slicing. – Bakuriu May 22 '13 at 12:16
  • Thanks for the suggestions. Just one thing, how do I add new modules? The explanation that I am finding everywhere is for linux/unix system but I am using Windows. Any suggestions there? – Aseem Bansal May 22 '13 at 12:19
  • @Zel Some modules provide binaries/installers for windows. For example [here](http://sourceforge.net/projects/numpy/files/NumPy/1.7.1/) you can find the installers for numpy. You may be interested in installing `setuptools` and then use the `easy_install` script to download the packages from [pypi](https://pypi.python.org/pypi). – Bakuriu May 22 '13 at 12:25
1

"how to benchmark the efficiencies?"

measure empirical orders of growth, that's how! :)

E.g., with data from the accepted answer: log_10(50.41/0.9) = 1.75 and log_10(2.31/0.21) = 1.04, so it's ~ n^1.75 empirical order of growth for your TD (trial division) code (in the range 1,000 ... 10,000) vs. ~ n^1.04 for the sieve of Eratosthenes.

The latter is consistent with n log log n, and the former with n^2 / log(n)^2, as it should be.

With g(n) = n^2 / log(n)^2, we have g(10000)/g(1000) = 56.25, which is just 0.4% off the empirical value 50.41/0.9 = 56.01. But with g2(n) = n^2 / log(n), we have g2(10000)/g2(1000) = 75, which is way off from the evidence.
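
To compute such estimates yourself, a couple of lines of Python suffice (the timings plugged in below are the ones quoted above):

from math import log

def empirical_order(t1, t2, n1=1000.0, n2=10000.0):
    # if run time ~ n^k over this range, then k = log(t2/t1) / log(n2/n1)
    return log(t2 / t1) / log(n2 / n1)

print empirical_order(0.90, 50.41)   # ~ 1.75  (trial division)
print empirical_order(0.21, 2.31)    # ~ 1.04  (sieve of Eratosthenes)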

About time complexity: actually, most composites fail early (they are multiples of small primes). Producing k = n/log(n) primes takes O(k^2) time here, because each prime is tested against all its preceding primes, as opposed to just those not exceeding its square root.

Composites don't add to the complexity (the exact analysis in M. O'Neill's JFP article (pg 4) gives the same complexity for composites as for primes when testing up to the square root - each composite is guaranteed to have a prime factor not greater than its square root - so the complexity for composites here is even less than the complexity for primes).

So overall it's O(k^2), which is to say, O(n^2/log(n)^2).


By adding just two lines you can improve your code's speed drastically, from O(n^2/log(n)^2) to O(n^1.5/log(n)^2) time complexity:

    for j in ans:
        if j*j > i:
            ans.append(i); break
        if i % j == 0:
            break
    # else:
    #     ans.append(i)

The improvement in time complexity means a run time ratio of about 17.8x for 10,000 over 1,000, instead of 56.25x as before. This translates to ~ n^1.25 empirical order of growth in this range (instead of ~ n^1.75). The absolute run time for the 10,000 call will be much closer to the 2.31 seconds than to the old 50.41 seconds.
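
For reference, here is the whole function with that change folded in (a sketch of the same two-line edit, under the question's Python 2.7 assumptions):

def primes_list(num):
    ans = [2]
    for i in range(3, num, 2):
        for j in ans:
            if j*j > i:
                # no prime <= sqrt(i) divides i, so i is prime
                ans.append(i)
                break
            if i % j == 0:
                # i has a smaller prime factor, so it is composite
                break
    return ans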


BTW your original code is equivalent to the famous code by David Turner,

primes = sieve [2..]
sieve (x:xs) = x : sieve [y | y <- xs, rem y x /= 0]

and the improved code, to this:

primes = 2 : sieve [3..] primes 
sieve xs (p:ps) | (h,t) <- span (< p*p) xs =
              h ++ sieve [y | y <- t, rem y p /= 0] ps 

(the code is in Haskell but I think it's readable enough; x:xs stands for a list with x as the head element and xs as the rest of the list).

Will Ness
0

Since range in Python 2 returns a list, your algorithm's space complexity is still O(n), so it is not more space efficient (at least not asymptotically). If you used xrange (or range in Python 3), it would be somewhat more space efficient because you'd only be storing the primes - not all numbers up to n.
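
A minimal sketch of that change - just swap range for xrange, everything else stays the same:

def primes_list(num):
    ans = [2]
    for i in xrange(3, num, 2):   # xrange is lazy, so no list of size ~n is built
        for j in ans:
            if i % j == 0:
                break
        else:
            ans.append(i)
    return ans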

Your time efficiency will be worse than a sieve either way.

Fred Foo
sepp2k