Sieve of Eratosthenes c++ implementation error

Question

I wrote this implementation of the Sieve of Eratosthenes in c++, but whenever the code reaches 59 (the 16th prime), it stops working. It can only reach 37 on my older machine. When I debug the program, all the variables seem to be working fine; the program simply crashes. Here it is: (I know it has a lot of comments, and lots are unnecessary.)

// Problem 7:What is the 10 001st prime number?

/*This program generates a list of prime numbers and uses the Sieve of Eratosthenes to find them.
 *This is done by crossing out multiples of the first known prime (2), taking the first number
 *not yet crossed out, and setting that as the next prime to "sieve" with.
 */

#include "stdafx.h"
#include <iostream>

using namespace std;

int main()
{
    int placeholder;                    //added at the end to prevent window from closing
    const int sieve_size = 10000;       //amount of #s to sieve through
    const int solution_number = 10001;  //# of primes to generate
    long long solution;                 //the 10 001st prime #
    long long current_sieve;            //current # sieving with
    int number_of_primes = 1;           //we know the first prime -- 2
    long long *primes = new long long [number_of_primes];
    primes[0] = 2;                  //2 is the first prime
    bool flag_havePrime = 0;        //whether we have our next prime yet; used when saving a prime
    bool sieve[sieve_size] = {0};   //0 is "could-be-prime" (not composite), 1 is composite.
    sieve[0] = 1;   //0 and 1 are not prime #s
    sieve[1] = 1;

    for (int i = 0; number_of_primes <= solution_number; i++)   //each loop sieves with a different prime
    {
        current_sieve = primes[i];

        //This next loop sieves through the array starting with the square of the number we will sieve with,
        //this optimizes the run time of the program.
        for (long long j=current_sieve*current_sieve; j <= sieve_size; j++)
            if (j%current_sieve == 0) sieve[j] = 1;

        /*This loop gets our next prime by looking through the array until
         *it encounters a number not crossed out yet. If it encounters a prime,
         *it increments the number of primes, then saves the new prime into
         *primes[]. It also prints that prime (for debugging purposes).
         *The "for" loop ends when it finds a prime, which it knows by encountering
         *the "havePrime" flag. This needs to be reset before every run.
         */

        for (long long j = primes[number_of_primes-1]+1; flag_havePrime == 0; j++)
        {
            if (sieve[j] == 0)
            {
                number_of_primes++;
                primes[number_of_primes-1] = j;             //because array counting starts @ 0
                cout << primes[number_of_primes-1] << endl; //because array counting starts @ 0
                flag_havePrime = 1;
            }
        }
        flag_havePrime = 0; //resetting this flag
    }
    solution = primes[number_of_primes-1];  //because array counting starts @ 0
    delete[] primes;
    primes = 0;

    cout << "The 10 001st prime number is:\n";
    cout << solution << endl;
    cin >> placeholder;
    return 0;
}

I'm thinking it could be an overflow problem?

Here's an updated snippet with only the changes:

const int sieve_size = 500000;
long long *primes = new long long [solution_number];

Debugging returns a (gasp) heap overflow though, but running the compiled version doesn't. The compiled version stops at 104759, going over by 1. That's probably easy enough to fix. But the program doesn't print the last bit, where it gives you the solution. Weird.

Just a note: that is not a Sieve of Eratosthenes. You're checking every number from p² on for divisibility by p: `if (j%current_sieve == 0) sieve[j] = 1;`, that's trial division. — Daniel Fischer, Mar 27 '12 at 03:33
@DanielFischer Oh well. It worked. :P Can you check divisibility in c++ any other way? I know there are divisibility tricks for numbers like 9 or 11, but for larger primes... — Ernest3.14, Mar 27 '12 at 03:47
Sure it works, it's just slower. The trick of Eratosthenes' sieve is to not check divisibility at all. You cross of multiples of `p`, so when you have found a prime `p` in your sieve, you cross off the multiples by incrementing the index by `p`, `for(mult = p*p; mult <= limit; mult += p)`. I've elaborated to varying degrees [here](http://stackoverflow.com/a/8335325/1011995), [here](http://stackoverflow.com/a/8643410/1011995) and if you like long reading, [here](http://stackoverflow.com/a/9704912/1011995). — Daniel Fischer, Mar 27 '12 at 10:38

paxdiablo · Accepted Answer · 2012-03-27T03:36:18.470

2

int number_of_primes = 1;           //we know the first prime -- 2
long long *primes = new long long [number_of_primes];

This will create a one-element array. I'm pretty certain you'll need something bigger than that for storing the primes.

Specifically, as soon as you start setting values like primes[11] (for example), you're into the realm of undefined behaviour.

Perhaps there's a different variable you may want to consider using for the size in the new statement, nudge, nudge, wink, wink, a nod's as good as a wink to a blind horse :-)

There's a few other problems you also have in that code. The main one is that your sieve itself is only 10,000 elements long. The idea of a sieve is that you take a large number of things and filter out those not matching. For what it's worth, the 10,001^st is a touch under 105,000 so your sieve should be at least that big.

Secondly, I've seen people use the square of a number to optimise finding of factors but not in this way:

for (long long j=current_sieve*current_sieve; j <= sieve_size; j++)
    if (j%current_sieve == 0) sieve[j] = 1;

What you want is to start at double the current sieve and add it each time, something like:

for (long long j = current_sieve * 2; j < sieve_size; j += current_sieve)
    sieve[j] = 1;

edited Mar 27 '12 at 03:36

answered Mar 27 '12 at 03:16

paxdiablo

854,327
234
1,573
1,953

Yeah, that too. I changed the sieve size to debug more easily and promptly forgot about it. And I think my optimization still works, but not as efficiently... Anyways, I updated my post with my new code. – Ernest3.14 Mar 27 '12 at 03:40
@Ernest3.14, whatever improvements you may get by starting at n^2 instead of 2n (and I'm not toally convinced that will work yet but I haven't found a counter-example so you may be right) will be totally obliterated by the fact your loop adds one and checks every number. You can just add curr_sv and not check: `(a+1)n = an + n`. – paxdiablo Mar 27 '12 at 03:45
Duh. That does make sense. Are there any other optimizations (in case I need to find _lots_ of primes)? – Ernest3.14 Mar 27 '12 at 04:04
For lots of primes, yes, but I'm talking _lots._ The e-sieve will easily handle huge numbers. Beyond that, the a-sieve (Atkin) may be better. Beyond _that,_ my favorite is the pax-sieve: http://stackoverflow.com/questions/9715256/reporting-all-prime-numbers-less-than-n/9715312#9715312 :-) – paxdiablo Mar 27 '12 at 04:08
I'm expecting that I misunderstand, @pax, but if you're doubting whether starting to cross off at p^2 instead of 2p is guaranteed to be correct: It is, because if n is composite, and d a divisor of n (between 1 and n exclusive), then one of d and n/d at least is not larger than `sqrt(n)`. Without loss of generality, let `d <= sqrt(n)`. Any prime factor `q` of `d` is then `<= sqrt(n)`, so n will be crossed off as a multiple of `q`. – Daniel Fischer Mar 27 '12 at 14:34
No, I think you may have hit the nail on the head there @Daniel, which is why I couldn't find any counter examples :-) As you state, you can start from p^2 but you should still increment by p rather than incrementing by 1 and checking if it's a multiple of p. – paxdiablo Mar 27 '12 at 14:38
Absolutely. As I commented on the OP, the way it's done there, it's not even a Sieve of Eratosthenes but trial division. At least only by primes, which is asymptotically as good as trial division can get. – Daniel Fischer Mar 27 '12 at 14:44

Andrew Tomazos · Answer 2 · 2012-03-27T03:44:40.133

0

Here is a fixed version to compare with:

#include <iostream>
#include <vector>
#include <string>

using namespace std;

int main()
{
    const int sieve_size = 1 << 20;
    const int solution_number = 10001;
    int number_of_primes = 0;
    vector<bool> sieve(sieve_size, true);

    for (int i = 2; i < sieve_size; i++)
    {
        if (!sieve[i])
            continue;

        if (++number_of_primes == solution_number)
        {
            cout << "The 10 001st prime number is:" << i << endl;
            return 0;
        }

        for (long long j = i*2; j < sieve_size; j += i)
            sieve[j] = false;
    }
}

Output:

The 10 001st prime number is:104743

edited Mar 27 '12 at 03:44

answered Mar 27 '12 at 03:32

Andrew Tomazos

66,139
40
186
319

1

Not really good form posting complete solutions to homework or Euler problems, the point is to _guide_ in those cases, not do the work for them :-) – paxdiablo Mar 27 '12 at 03:37
@paxdiablo: To be honest it was quicker to give him a working version for him to compare with than to reverse engineer his solution. Better than nothing. If he hands it in verbatim without understanding it, he is not going to get very far. – Andrew Tomazos Mar 27 '12 at 03:43
@user1131467 Since all you have to submit for Project Euler problems is the answer, even the last line of your answer alone will get the OP far enough. – Daniel Fischer Mar 27 '12 at 14:37
I really do like figuring them out though, ;) – Ernest3.14 Mar 27 '12 at 14:48

Sieve of Eratosthenes c++ implementation error

2 Answers2