I am trying to find a way to find the distribution of prime gaps for primes less than 100,000,000.
My method:
Step 1: Start with a TXT file "primes.txt" which has a list of primes (up to 10,000,000).
Step 2: Have the program read the file and then insert each number into a list, p1.
Step 3: Find the square root of the upper bound (10 times the upper bound of the primes in the TXT file, in this case, 100,000,000) and create another list, p2, with all the primes less than or equal to that square root, rounded up.
Step 4: Define an isPrime() method that checks whether the input is a prime number (N.B.: because I know that the numbers that will be checked are all less than 100,000,000, I only have to check whether the number is divisible by all the primes less than or equal to the square root of 100,000,000, which is 10,000)
Step 5: Add a list l which collects all the prime gaps, then iterate from 1 to 100,000,000, checking the primality of each number. If the number IS prime, then record the gap between it and the last prime number before it, and also write it to another document "primes2.txt".
Step 6: Output the list l.
The problem:
The program seems to take a LONG time to run. I have a feeling that the problem has to do with how I am managing the list due to its size (the Prime Number Theorem estimates about 620,420 elements in that list from "primes.txt"). Is there a way to reduce the runtime for this program by handling the list differently?
I've attached my code below.
import math
import sys
f = open("primes.txt","r")
p1 = []
for i in f:
p1.append(int(i))
f.close()
ml = 10000000
ms = math.sqrt(10*ml)
p2 = []
x1 = 0
while x1 < len(p1) and p1[x1] <= int(ms+0.5):
p2.append(p1[x1])
x1 += 1
def isPrime(n):
for i in p2:
if n%i == 0:
if n/i == 1:
return True
return False
return True
def main():
l = [0]*1001 #1,2,4,6,8,10,12,...,2000 (1, and then all evens up to 2000)
lastprime = -1
diff = 0
fileobject = open("primes2.txt",'w')
for i in xrange(1,10*ml):
if isPrime(i):
if i > 2:
diff = i - lastprime
if diff == 1:
l[0] += 1
else:
l[diff/2] += 1
lastprime = i
fileobject.write(str(i)+"\n")
if i%(ml/100) == 0:
print i/float(ml/10), "% complete"
fileobject.close()
print l
main()
EDIT: Made a change in how the program reads from the file