6

The question is to find the 1000th prime number. I wrote the following python code for this. The problem is, I get the right answer for the 10th , 20th prime but after that each increment of 10 leaves me one off the mark. I can't catch the bug here :(

count=1            #to keep count of prime numbers
primes=()          #tuple to hold primes
candidate=3        #variable to test for primes
while count<20:
    for x in range(2,candidate):
        if candidate%x==0:
            candidate=candidate+2
        else : pass
    primes=primes+(candidate,)            
    candidate=candidate+2
    count=count+1
print primes        
print "20th prime is ", primes[-1]

In case you're wondering, count is initialised as 1 because I am not testing for 2 as a prime number(I'm starting from 3) and candidate is being incremented by 2 because only odd numbers can be prime numbers. I know there are other ways of solving this problem, such as the prime number theorem but I wanna know what's wrong with this approach. Also if there are any optimisations you have in mind, please suggest.

Thank You

Zero Piraeus
  • 56,143
  • 27
  • 150
  • 160
gsin
  • 307
  • 1
  • 3
  • 5
  • 1
    Tuple is immutable, you are making a copy of the tuple each time, instead you should initialize it as a list (primes=[]) and user primes.append() - this should not have too much effect on the speed though... – Kimvais Jan 03 '10 at 19:03
  • Hm ... good question. Probably not the fastest implementation. I always wanted to build an efficient generator of primes that can go on for as long as it has enough memory to compute what it needs. The famous Sieve will not work then - you do not want to choose an upper limit on the numbers you look at before you start. But, O(n) complexity for every new number inspected is also crap. Ideally you want to have a list and a set of primes computed so far, and them to a modulo test against that. I have not been able to write this out fully ... I know Haskell language has a primes generator. – Hamish Grubijan Jan 03 '10 at 19:07
  • @SilentGhost Why? @Kimvais I know, I'm creating a new tuple each time, but using a list wouldn't solve my problem either @Ipthnc That's exactly what I had in mind, but even I couldn't write it down, maybe I'll put in more time tm. for that My problem is still unsolved though – gsin Jan 03 '10 at 19:10
  • 3
    This sounds suspiciously like a Project Euler problem. – JesperE Jan 03 '10 at 20:00
  • 5
    A general remark: It is sufficient to divide n by all numbers up to square root of n to find out whether n is prime or not (you are testing up to n) – Felix Kling Jan 03 '10 at 20:43

10 Answers10

8

There is a nice Sieve of Eratosthenes generator implementation in test_generators.py:

def intsfrom(i):
     while 1:
         yield i
         i += 1

def firstn(g, n):
     return [g.next() for i in range(n)]

def exclude_multiples(n, ints):
     for i in ints:
         if i % n:
             yield i    

def sieve(ints):
     prime = ints.next()
     yield prime
     not_divisible_by_prime = exclude_multiples(prime, ints)
     for p in sieve(not_divisible_by_prime):
         yield p

primes = sieve(intsfrom(2))

>>> print firstn(primes, 20)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

alt text

Community
  • 1
  • 1
jbochi
  • 28,816
  • 16
  • 73
  • 90
  • 5
    Actually this code is not an implementation of the Sieve of Eratosthenes, as it computes the remainder of each candidate divided by every prime already found... The SoE performs *no divisions* at all, just addition (which is all that is necessary to cross out numbers at fixed increments). There is a nice paper by Melissa E. O'Neill -- "The Genuine Sieve of Eratosthenes" -- which discusses the difference between this type of naive implementation and a real SoE in the context of Haskell, though it is not at all Haskell-specific, or even FP-specific, and could easily be rewritten in Python. – Michał Marczyk Jan 04 '10 at 00:16
  • 1
    Continuing past the comment length limit... There's a *huge* difference in performance between a proper SoE and this kind of naive sieve, although that certainly doesn't prevent the naive sieve from being quite appropriate in the context of testing Python's generator implementation. :-) – Michał Marczyk Jan 04 '10 at 00:18
  • @Michal. Very well noticed! It's not the real Sieve indeed. Thanks. – jbochi Jan 04 '10 at 01:47
5

There is lots (!) to be improved with your Python code, but to answer your specific question:

When you have found a divisor (candidate % x == 0), you increment the candidate, but you don't do anything about x. This can lead to two problems:

  1. candidate might have a divisor, but it's smaller than any x that gets tested -- because testing in the next iteration of the loop starts with x being one higher than the value x had before; not at 2.
  2. candidate might have a divisor, but it's larger than x ever gets, because you take x from the values from 2 to the value candidate had when you started the loop.
balpha
  • 50,022
  • 18
  • 110
  • 131
3

I don't think this is testing what you think it's testing. It looks like you're trying to say "for each number between 2 and my candidate, check to see if the candidate is evenly divisible by that number". However, when you find a prime (candidate%x == 0), you're only incrementing the candidate-- you still need to start your "for x in ..." loop over again, since the candidate has changed.

That's what I can see from the code as written; there are of course lots of other ways and other optimizations to use here.

Ben Zotto
  • 70,108
  • 23
  • 141
  • 204
2

It's good to know that every prime number bigger than 3 can be written as: 6k-1/+1.

When you are looking for the next candidate, you can always write something like this (code snippet is in C):

a=1;
...
candidate=6*k+(a=(a==-1)?1:-1);
if(a==1){
           k++;
}

And a function I've used not so long ago to determine the nth prime number, where LIM is the nth number you are looking for (C code):

int sol2(){
        int res,cnt,i,k,a;
        res=-1;
        i=1;
        cnt=3;
        k=1;
        a=1;
        while (1){
                if (util_isprime(cnt)){
                        i++;
                        if (i==LIM){
                                res=cnt;
                                break;
                        }
                }
                /* 6k+/-1 starting from 6*1-1 */
                cnt=6*k+(a=(a==-1)?1:-1);
                if(a==1){
                        k++;
                }
        }
        return res;
}
Andrei Ciobanu
  • 12,500
  • 24
  • 85
  • 118
2

in the statement:

for x in range(2,candidate)

you can reduce the number of iterations by scanning up to sqrt(candidate)

If candidate is divisible by x, then we can write candidate=x*b for some b. If x is less than or equal to b, then x must be smaller than or equal to the square root of candidate

Pierre
  • 34,472
  • 31
  • 113
  • 192
0

As for optimizations, if you're definite that you want to follow this implementation, you could avoid looking at numbers that:

  1. end in 5, since they are divisible by 5.
  2. are comprised by the same digits, e.g. 22, 33, 44, 55, 66, etc., since they are divisible by 11.

Don't forget to add 5 and 11 as primes though!

Alex Ntousias
  • 8,962
  • 8
  • 39
  • 47
  • 1
    it would be easy if the divisors could be a list of previously collected primes. I think I'll go with that approach. Just to get it down in code now!! – gsin Jan 03 '10 at 19:29
  • 2
    @Alex: Assuming the OP is using a binary computer, checking whether the last 10-base digit is 5 comes down to the same thing as dividing by 5 in the first place. A similar argument holds for your second point. – balpha Jan 03 '10 at 19:32
  • 1
    #2 is not really an optimization on a binary computer. #1 can be generalized to implement something called a 'wheel'. You take the first k primes, say for example 2, 3, 5, and then only try candidates that are not 0 mod any of these primes. So, mod 30 that means you only try candidates have the following remainders mod 2*3*5: [1, 7, 11, 13, 17, 19, 23, 29]. Thus, you can have something like this: primes=[2,3,5]; for base in xrange(0, 10000000, 2*3*5): for rem_mod30 in [1, 7, 11, 13, 17, 19, 23, 29]: candidate = base + rem_mod30 # check if candidate is prime – President James K. Polk Jan 03 '10 at 19:46
0

Unless I'm very much mistaken, you are always adding the current candidate to the list of primes, regardless of whether a divisor has been found or not. Your code to append to the list of primes (putting aside the immutable tuple comment made earlier) is outside of the test for integer divisors, and therefore always run.

tw39124
  • 9,103
  • 2
  • 20
  • 14
0

If you want anything remotely efficient, do the Sieve of Eratosthenes - it's as simple as it's old.

MAX = 10000
candidates = [True] * MAX
candidates[0] = False
candidates[1] = False

primelist = []
for p,isprime in enumerate(candidates):
    if isprime:
        primelist.append(p)
        for n in range(2*p,MAX,p):
            candidates[n] = False

print primelist[1001]
Jochen Ritzel
  • 104,512
  • 31
  • 200
  • 194
  • I have seen this, but what if I want the first 20,000 prime numbers? It is hard to pick the MAX then. Also, what if I want to go on indefinitely? – Hamish Grubijan Jan 03 '10 at 19:54
  • there are about `x/log x` primes below `x` so you can just compute the MAX. There are other algorithms to find very large primes. – Jochen Ritzel Jan 03 '10 at 21:30
0

FYI... I solved it with the following code, though it can be optimised much much more, I just wanted to solve it in this manner first. Thanks everyone for your help.

from math import *
primes=[2,3]
count=2
testnumber=5
while count<1000:

    flag=0
    for x in range(2,testnumber):
        if x<=sqrt(testnumber):
            if testnumber%x==0:
                #print testnumber , "is not a prime"
                flag=1

            else : pass
    if flag!=1:
        #print testnumber , "is a prime"
        primes=primes+[testnumber]
        count=count+1
    testnumber=testnumber+2


#print primes
print "1000th prime is ", primes[-1]

I will now look at all the other algorithms mentioned by you all

gsin
  • 307
  • 1
  • 3
  • 5
0

c beginner

#include<stdio.h>
int main ()

{
int a,s,c,v,f,p,z;

while(scanf("%d",&f) !=EOF){
p=0;
for(z=1;p<f;z++){
                s=2;
                a=z;
                while(s<a){
                          if(a%s==0)s=a+1;
                          else s=s+1;
                          }
                if (s!=a+1)p++;

                }
printf("%d\n",a);
                            }

return 0;
}
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Saikot
  • 1