Further speeding up of Sieve method of Eratosthenes to find primes

Question

I saw this c code of using Sieve method of Eratosthenes to find primes, but I cannot extend it to even larger integers (for example, to 1000000000 and even larger) because of memory consumption to allocate such a large char array.

What would be the strategies to extend the code to larger numbers? Any references are also welcome.

Thanks.

How is memory consumption a problem? 10**9 bits is about 120 MB, which isn't too much on today's personal computers. Actually, it seems that even 2**32 (the whole range of unsigned 32 bit integers) fits into 512 MB. Sure, that's a lot for a single application, but memory is plenty these days. — , Oct 05 '11 at 20:17
I've checked the results with two other calculators and mental arithmetic suggests the results can't be completely off. 10**9 bit is 125 * 10**6 byte (divide by 8) is roughly 122 * 10**3 kilobyte (divide by 1024) is roughly 119 megabyte (divide by 1024). Please explain how you arrive at 1GB? — , Oct 05 '11 at 20:34
@caf: I assume you meant to make the array global? what is the difference here, static vs. nonstatic? I am not aware of this issue. Thanks a lot. — Qiang Li, Oct 06 '11 at 04:40
No, it is not necessary to make it global, you can just add the `static` storage class specifier. Objects with static storage duration exist just once for the entire invocation of the program, so they are typically not stored on the stack but in a separate memory region. This will make a difference to your case because the stack is normally quite limited in maximum size. — caf, Oct 06 '11 at 05:12
@Jerry Coffin: Sieve of Atkin is faster only for large (>=1e9) numbers http://stackoverflow.com/questions/622/most-efficient-code-for-the-first-10000-prime-numbers/175956#175956 — jfs, Oct 21 '11 at 10:51
Sieve of Atkin implementation for small ~32bit numbers: http://cr.yp.to/primegen.html — jfs, Oct 21 '11 at 10:52

score 2 · Answer 1 · edited May 23 '17 at 10:32

The standard improvement to apply would be to treat each i-th bit as representing the number 2*i+1, thus representing odds only, cutting the size of the array in half. This would also entail, for each new prime p, starting the marking-off from p*p and incrementing by 2*p, to skip over evens. 2 itself is a special case. See also this question with a lot of answers.

Another strategy is to switch to the segmented sieve. That way you only need about pi(sqrt(m)) = 2*sqrt(m)/log(m) memory (m being your upper limit) set aside for the initial sequence of primes with which you'd sieve smaller fixed-sized array, sequentially representing segments of numbers. If you only need primes in some narrow far away range [m-d,m], you'd skip directly to sieving that range after all the needed primes have been gathered, as shown e.g. in this answer.

Per your specifics, to get primes up to 10^9 in value, working with one contiguous array is still possible. Using a bitarray for odds only, you'd need 10^9/16 bytes, i.e. about 60 MB of memory. Easier to work by segments; we only need 3402 primes, below 31627, to sieve each segment array below 10^9.

score 0 · Answer 2 · answered Oct 05 '11 at 20:15

0

Exactly because of the size of the array required, the Sieve of Eratosthenes becomes impractical at some point. A modified sieve is common to find larger primes, (as explained on Wikipedia).

answered Oct 05 '11 at 20:15

driis

161,458
45
265
341

score 0 · Answer 3 · edited May 23 '17 at 12:34

0

You could use gmp library. See Speed up bitstring/bit operations in Python? for the fast implementation of Sieve of Eratosthenes. It should be relatively easy to translate the provided solutions to C.

edited May 23 '17 at 12:34

Community

1
1

answered Oct 21 '11 at 10:42

jfs

399,953
195
994
1,670

Further speeding up of Sieve method of Eratosthenes to find primes

3 Answers3