2

I was curious about prime numbers and would like to know the most efficient way to find relatively small prime numbers for a range up to, say, 10 million. I read that the sieve of Eratosthenes (SOE) is the most efficient way to find smaller prime numbers. I implemented SOE using Python but had a few questions:

  1. The worst case running time of my algorithm seems to be O(n^2). I'm still learning, so I know this algorithm can be made more efficient.

  2. Is there a difference in the most efficient mathematical way and most efficient programming way to find prime numbers? Mathematically, SOE is one of the fastest, but programming-wise is SOE all that fast?

    def FindPrime(n):
        primes = [2, 3]
        for num in range(4, n):
            notprime = False
            for p in primes:
                if num % p == 0:
                    notprime = True
            if notprime == False:
                primes.append(num)
    
        return primes
    
    print FindPrime(100)
    
MNRC
  • For algo answer: http://stackoverflow.com/questions/453793/which-is-the-fastest-algorithm-to-find-prime-numbers – Jason S Aug 06 '14 at 01:42
  • But vanilla Python is not efficient at anything big & mathy, period. – Jason S Aug 06 '14 at 01:42
  • @Jason S - Using Python is completely irrelevant to the *asymptotic* complexity/efficiency of algorithms, except maybe in that it provides a particular set of convenient ready-made building blocks. Assuming you use the same algorithm (with same-algorithm building blocks), Python may well be slower than C or C++, but only by constant factors. A key point about asymptotic analysis (big O etc) is that constant factors are irrelevant. –  Aug 06 '14 at 02:00
  • The question asked about the difference between math & programming efficiency. If it's really just about the algo it is a dupe. – Jason S Aug 06 '14 at 02:03
  • @Jason S - good point, but still - efficiency isn't the only factor in language choice. –  Aug 06 '14 at 02:14
  • People have already answered your question, but I will just point out that the running time of your algorithm is n^2/log n as a consequence of the prime number theorem (the inner loop is order n/log n). – TheGreatContini Aug 06 '14 at 03:19
  • @TheGreatContini: How did you calculate the inner loop as order n/log n? – MNRC Aug 26 '14 at 06:56
  • See the prime number theorem: http://en.wikipedia.org/wiki/Prime_number_theorem#Statement_of_the_theorem – TheGreatContini Aug 26 '14 at 23:51

2 Answers

4

First of all, you should know that your algorithm isn't the sieve of Eratosthenes. You're using trial division.

There are a number of improvements that can be made to your implementation.

  1. Use xrange(), which is O(1) memory-wise, not range(), which is O(n).

  2. Skip even numbers in your search: xrange(5, n, 2) steps 2 at a time, visiting only odd candidates (2 is already in your list).

  3. Don't test whether a prime p divides num once p > sqrt(num): a composite number always has a prime factor no greater than its square root.

As you predicted, these changes don't affect the order of complexity, but you'll see a solid performance improvement.
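A minimal sketch of trial division with all three improvements applied (Python 3 syntax, where `range` is already lazy like Python 2's `xrange`; `find_primes_trial` is just an illustrative name):

```python
def find_primes_trial(n):
    """Trial division with the three improvements above; primes below n."""
    if n <= 2:
        return []
    primes = [2]
    for num in range(3, n, 2):          # skip even candidates
        is_prime = True
        for p in primes:
            if p * p > num:             # no divisor needed beyond sqrt(num)
                break
            if num % p == 0:
                is_prime = False        # found a divisor: num is composite
                break
        if is_prime:
            primes.append(num)
    return primes
```

Stopping at the first divisor (and at sqrt) is where most of the practical speedup comes from, even though the asymptotic class stays the same.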

As for a faster algorithm, first implement a real sieve of Eratosthenes, then try the much faster sieve of Atkin.
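For reference, a straightforward sieve of Eratosthenes looks something like this (a readable sketch, not a tuned implementation):

```python
def sieve_of_eratosthenes(n):
    """Return all primes below n using the sieve of Eratosthenes."""
    # is_prime[i] says whether i is still considered prime.
    is_prime = [True] * n
    is_prime[0:2] = [False, False]      # 0 and 1 are not prime
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Cross off multiples of i, starting at i*i: smaller
            # multiples were already crossed off by smaller primes.
            for multiple in range(i * i, n, i):
                is_prime[multiple] = False
    return [i for i in range(n) if is_prime[i]]
```

Note the key difference from trial division: the sieve never divides; it only marks multiples, which is why it runs in O(n log log n).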

salezica
  • Great to know. Could you implement the sieve of Eratosthenes here in Python? Would love to see how it is coded in this language. – MNRC Aug 06 '14 at 03:34
  • Sieve of Atkin is only much faster than the Eratosthenes version when you are using optimized code. See http://cr.yp.to/primegen/primegen-0.97.tar.gz for the current version of the source code from the original authors of the improved sieve. – Gary Walker Aug 06 '14 at 04:01
  • Read the source that I referenced and you will see optimized code. It is algorithmically opaque and hard to get right. But it does run fast. Suggesting writing the Atkin version is a waste of time for a newbie question; esp. in Python. Python is great for a lot of things, all-purpose speed is not one of those things, even with PyPy, Psyco and Pyrex, etc. – Gary Walker Aug 06 '14 at 13:21
  • Sending a newbie to write C++ is wasting his time. This is learning. – salezica Aug 06 '14 at 22:02
0
  1. uʍop ǝpısdn is right: your code is not SOE

  2. you can find my SOE implementation here

    • it makes the prime finding more efficient than your solution
  3. complexity of my SOE implementation

    • time: T(0.5·n·DIGAMMA(CEIL(SQRT(n)))+0.3511·n) if the sqrt(n) cutoff is used as suggested in a comment inside the code
    • time(n= 1M): T(3.80*n)
    • time(n= 10M): T(4.38*n)
    • time(n= 100M): T(4.95*n)
    • time(n=1000M): T(5.53*n)
    • so approximately the run time is: T((0.3516+0.5756*log10(n))*n)
    • so the complexity is O(n·log(n))
  4. difference between speed (runtime) and complexity O()

    • the actual runtime is t=T(f(n))*c
    • for big enough n it converges to t=O(f(n))*c
    • where O() is the asymptotic time complexity of the algorithm
    • and T() is the actual run-time equation for any n (not O()!)
    • c is some constant time needed to process a single pass through all the loops together, etc.
    • a better O() does not mean a faster solution
    • it wins only past the threshold n where
    • O1(f1(n))*c1 < O2(f2(n))*c2
    • so if you optimize the constant c well, you can beat algorithms with better complexity up to that threshold
    • for example, your code is around T(n·n/2) -> O(n^2)
    • but for low n it can be faster than my SOE with O(n·log(n))
    • because mine needs to prepare tables, which takes more time than your division up to a point
    • but after that, yours gets much, much slower ...

So for the question of whether there is a difference between the most efficient mathematical and programming solution:

  • the answer is YES: they can differ over a given range of n
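The threshold effect described in point 4 can be seen directly with a small timing sketch (Python 3; `trial_division` and `sieve` here are illustrative stand-ins, not the linked implementation, and the absolute timings are machine-dependent):

```python
import timeit

def trial_division(n):
    # Roughly O(n^2 / log n): test each candidate against every smaller prime.
    primes = []
    for num in range(2, n):
        if all(num % p for p in primes):
            primes.append(num)
    return primes

def sieve(n):
    # O(n log log n): cross off multiples of each prime up to sqrt(n).
    is_prime = [True] * n
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for m in range(i * i, n, i):
                is_prime[m] = False
    return [i for i in range(n) if is_prime[i]]

# The sieve's setup cost matters little at small n, but its better
# growth rate dominates as n increases.
for n in (100, 10000):
    t_trial = timeit.timeit(lambda: trial_division(n), number=5)
    t_sieve = timeit.timeit(lambda: sieve(n), number=5)
    print("n=%d  trial=%.4fs  sieve=%.4fs" % (n, t_trial, t_sieve))
```

At n=10000 the sieve should already be clearly ahead; where exactly the crossover sits depends on the constant factors of each implementation, which is the point being made above.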
Spektre