Segmented sieve of eratosthenes: sieving composite numbers by their indexes?

Question

I'm trying to code a prime number finder that prints primes between two given values. I've got no problem with coding traditional sieve, but when it comes segmented, my python knowledge is coming up short. Here is what I've done so far:

def primes(n): # traditional sieve finding primes up to sqrt(n)
    myPrimeList= []
    mySieve= array('B', [True]) * (int(n**0.5)+1)
    for i in range(2,int((n**0.5)+1)):
        if mySieve[i]:
            myPrimeList.append(i)
            for x in range(i*i,int(n**0.5)+1,i):
                mySieve[x]= False
    return myPrimeList

def rangedprimes(x,y):
    output = []
    sieve = [True] * (y-x+1)
    primeList = primes(y) # primes up to sqrt(y)
    minimums = [(x//m)*m for m in primeList if x>=m] # multiplying primes so they get close to the lower limit
    zipped = list(zip(primeList, minimums)) # just zipped to see it clearer, contributes nothing
    return zipped

print(primes(20))
print(rangedprimes(10,20))

[2, 3] # primes up to sqrt(20)
[(2, 10), (3, 9)] # primes and their smallest multiples

Now, according to the algorithm, I have to turn these numbers' [10, 12, 14, 15, 16, 18, 20] values from True to False in the sieve so that the remaining numbers, which will be marked with True, can be prime numbers. At this point, I can't make that happen as I've got a sieve containing True only y-x+1 times, which means it has indexes from 0 to y-x. For example, how can 16 or 20 can be marked with False in a sieve where the last index number will only be 10? If the starting index number of the sieve were to be 10, and the last index number be 20, I could find the numbers in the sieve by looking their indexes and make them False.

What should be the relationship between the sieve and the composite numbers between the range in this case?

Your code should really be: 1. get list of primes from `1` to `y`. 2. return primes between `x` and `y`. It's not clear why your "sieve" stops at `sqrt(n)`. — jonrsharpe, Aug 06 '14 at 10:01
@jonrsharpe Because primes up to `sqrt(n)` is enough to find the primes between `x,y`. For example, you don't have to decide whether 7 is prime or not if your upper limit is 20. The algorithm will cross out 14(the only multiple of 7 up to 20) because of 2. It's just an optimization thing, but my issue is, I can find the numbers that must be gone, however, I can't, because of my lack of python knowledge. — user2694307, Aug 06 '14 at 10:07
You only have to *check* up to the `sqrt`, but the *sieve array* should contain all numbers up to `n`. `primes(20)` should give `[2, 3, 5, 7, 11, 13, 17, 19]`, not just `[2, 3]`. — jonrsharpe, Aug 06 '14 at 10:19
@jonrsharpe At first, I did this in the way you are suggesting. But, it was inefficient for the values such as _10^6_ or _10^9_. Let's say you wanted to find the primes between _10^9_ and _10^9 + 1000_. You would have to find all the primes up to _10^9 + 1000_, which is really slow and space-required. In the way I'm trying to do now, you don't need _10^9 + 1000_ space, you can handle it making a sieve containing only _1000_ `True`. — user2694307, Aug 06 '14 at 10:35
Right, I think I see - so you generate all primes up to `sqrt(y)`, which you can then use to determine which numbers between `x` and `y` are prime? — jonrsharpe, Aug 06 '14 at 10:37
@jonrsharpe Exactly, you can read more extensive explanation of this here: [explanation](http://turjachaudhuri.wordpress.com/2013/12/14/spoj-prime-1-segmented-sieve-of-eratosthenes/) And here is my first code if you want to look at, which was problematic for big values: [my first code](http://ideone.com/KbTlij) — user2694307, Aug 06 '14 at 10:45
Yep, with you now - I've updated my answer, sorry for the misinterpretation — jonrsharpe, Aug 06 '14 at 10:55
@WillNess I realized that you have too many educational answers on this topic. After reading them all, I will try to do it again. But I think, as you said, my problem is with python coding. — user2694307, Aug 06 '14 at 11:57
simply translate between index and value with index = value - start. use same code as you already have. pseudocode is [here](http://stackoverflow.com/a/19641049/849891). Just work with *two* arrays: one 1...sqrt(m), the other n...m . — Will Ness, Aug 06 '14 at 11:59

jonrsharpe · Answer 1 · 2014-08-06T10:48:22.000

Here is what I think you're trying to do:

import math

def prime_sieve(n):
    """Use the Sieve of Eratosthenes to list primes 0 to n."""
    primes = range(n+1)
    primes[1] = 0
    for i in range(4, n+1, 2):
        primes[i] = 0
    for x in range(3, int(math.sqrt(n))+1, 2):
        if primes[x]:
            for i in range(2*primes[x], n+1, primes[x]):
                primes[i] = 0
    return filter(None, primes)

def ranged_primes(x, y):
    """List primes between x and y."""
    primes = prime_sieve(int(math.sqrt(y)))
    return [n for n in range(x, y) if all(n % p for p in primes)]

Note that I've kept a traditional sieve all the way to n, then called it to sqrt(y) in the ranged_primes function.

A demo from 10**6 to 10*6 + 10**3:

>>> ranged_primes(10**6, 10**6+10**3)
[1000003, 1000033, 1000037, 1000039, 1000081, 
 1000099, 1000117, 1000121, 1000133, 1000151, 
 1000159, 1000171, 1000183, 1000187, 1000193, 
 1000199, 1000211, 1000213, 1000231, 1000249, ...]

matches the results shown by Wolfram Alpha.

The `ranged_primes` function you've coded had come to my mind, but I tried to avoid the use of `modulo`. Because now it's become efficient for close ranged limits like in your example(10**6 to 10**6 + 10**3) and it is okay, but it is really slow if the _range_ is to be something like a _million_. That's why I wanted to use one more sieve. I mean, we could also find primes 1 to n using the modulo and that would be okay if we were to find 1 to 1000 because of a close range. But 1 to a million, modulo is a pain I think. — user2694307, Aug 06 '14 at 11:12
@user2694307 yes, I see. If you want to mark non-primes in an array starting from e.g. `10`, just subtract `10` from `i` to get the actual index. — jonrsharpe, Aug 06 '14 at 11:17

Segmented sieve of eratosthenes: sieving composite numbers by their indexes?

1 Answers1