C++ prime numbers interval

Question

I have to write script to print prime numbers in given range. Currently I am using the following script:

#include <iostream>
using namespace std;
int main()
{
    int n,p,m;
    cin>>n >> m;
    int * arr;

    arr= new  int[m+1];

    for (int i=0; i<=m; i++)
    {
      arr[i]=0;
    }

    for(int i=2;i<=m;i++){
        if(arr[i]==0)
        {   p=i;
            for (int j=2;p*j<=m;j++)
            {
                arr[p*j]=1;
            }
        }
    }

    for(int i=n;i<=m;i++){
        if(arr[i]==0)cout<<i<<endl;
    }
    delete[] arr;
    return 0;
}

It works fine for small inputs (it prints 1 as prime but that is easy to fix). However, when I input numbers like 1999998973 and 1999999973 it crashes with bad_alloc.

I know I probably made a huge array, but I don't know how to fix it. I have tried a different algorithm but it was really slow, I need it to print the numbers within about 2 seconds or so. Does anyone have an idea?

Presumably you don't have `sizeof(int) * 1999999973` of contiguous memory available, which isn't surprising. — user657267, Oct 18 '14 at 12:18
Could you identify where exactly you get `bad_alloc`? I suspect that your program fails at `ar = new int[m+1];` which means that your program cannot allocate that much contiguous space for an array (either because you don't have such a huge block available or because your program is not allowed to get that much memory) — UnholySheep, Oct 18 '14 at 12:19
No more information given - just bad_alloc.Ye I thought so but how can I fix it? I tried several ways but I just cant fix it. Sometimes program runs too slow, sometimes it throws errors. Really I am screwed. — John, Oct 18 '14 at 12:43
make it `new [m+1 - n]`, which means you only allocate space for the amount of numbers in that range. You have to map the numbers to the array then tho — Creris, Oct 18 '14 at 12:46

score 0 · Answer 1 · answered Oct 18 '14 at 14:05

This seems to work for the numbers I looked at from 2 to 500. It doesn't blow up with the 1999998973 to 1999999973 range and gives these results which I don't know if they are correct or not:

1999999003 1999999013 1999999049 1999999061 1999999081 1999999087 1999999093 1999999097 1999999117 1999999121 1999999151 1999999171 1999999207 1999999219 1999999271 1999999321 1999999373 1999999423 1999999439 1999999499 1999999553 1999999559 1999999571 1999999609 1999999613 1999999621 1999999643 1999999649 1999999657 1999999747 1999999763 1999999777 1999999811 1999999817 1999999829 1999999853 1999999861 1999999871 1999999873 1999999913 1999999927 1999999943 1999999973

int n,p,m;
cout << "enter two numbers" << endl;
cin >> n >> m;
int size = (m - n) + 1; 

int * arr = new  int[size];

// initialize the array to all zeros
for (int i=0; i < size; i++)
{
  arr[i]=0;
}
// loop once for the entire set of numbers from 2 to max value to 
// multiply with (sqrt of top of range)
int max = sqrt(m);

for(int i = 2; i <= max ;i++)
{
    p = i;
    for (int j=2 ; p*j <= m ; j++)
    {
        if ( (p * j) >= n  )
        {
           arr[(p*j) - n] = 1;
        }
    }
}

for(int i = 0; i < size; i++)
{
    if ( arr[i]==0 )
        cout << (i + n) << endl;
}
delete[] arr;

Christophe · Answer 2 · 2014-10-18T15:15:25.870

0

The problem is the space of your array.

Option 1: you can reduce the array to the size of the range. You gain space, but you have extra calculation, because you have to consider all numbers outside the range in your eratostene sieve.

Option 2: copress your data using a bit instead of an int. You can do this easily with vector<bool> which is designed to exactly that.

With option 2 you can keep your code as it is, just changing arr to:

vector<bool> arr(m+1);

You could even save yourself the initialisation loop with:

vector<bool>arr(m+1, false);

Whatever option you take, I'd suggest to replace the inner loop of your eratostene sieve with an additive version:

for (int i = 2; i <= m; i++){    
    if (arr[i] == 0)
    {
        for (int j = i*2; j <= m; j+=i)  // no p, no multiplication, just add i at each iteration
        {
            arr[j] = 1;
        }
    }
}

Time

Save space increases time. However I doubt that even if you had enough memory that you'd get an answer within 2 secs for your so huge numbers. If you have to process 2 000 000 000 numbers it would require that each number is processed in maximum 1 nanosecond, i.e. one single CPU instruction to meet your expectations.

edited Oct 18 '14 at 15:15

answered Oct 18 '14 at 14:59

Christophe

68,716
7
72
138

`vector` slows down things when implemented as bitfield because you cannot address single bits, only bytes. By packing more values in a single byte, you add a selection overhead. In fact, first you have to find which byte contains the desired bit (which mostly involves one subtraction and one bitwise shift), then read it and finally mask the desired bit. – Stefano Sanfilippo Oct 18 '14 at 15:21
That's why I speak of space vs. time tradeoff. There is no mystery, either you have the space necessary, or you pay the price of not having it either by bit shifting/oring/anding or by not storing elements in the array. For instance, if you don't store the eratorstene sieve for the 1000 first numbers, you have to loop throug you 200000000000 numbers approx a thousand times more. – Christophe Oct 18 '14 at 15:28
Not necessarily. If you used `char`/`bool` (not `vector`, as some genius thought a bitfield would be a great idea) instead of `int` on x86, you will have reduced the allocated memory to 1/4 with no noticeable performance hit. An hypothetical machine able to address memory at bit level (x86 is not such a machine) will not even notice any difference with bitfields. – Stefano Sanfilippo Oct 18 '14 at 16:15
The question is about pushing the limits. With the same amount of memory, with char you multiply by four the max number that you are able to handle. With a vector you multiply 32. Don't misunderstand me: I agree with you, char is more efficient than a bitfield. But what if it would not be sufficient ? How could we push limits further ? – Christophe Oct 18 '14 at 21:43

score 0 · Answer 3 · edited May 23 '17 at 12:27

I suggest changing the algorithm using wheel factorization.

As described in the first paragraph of the provided link, you can determine if a number is a prime by dividing 2,3,5 and then divide by numbers congruent to modulo 30 until the square root of the number.

One such implementation is listed below. By checking for primes this way, you won't need to create arrays up to m. The only array you'll need is wheelOf30[].

#include <math.h>

static const bool wheelOf30[] = {
    false,  true, false, false, false, false, false,  true, false, false,  //  1,  7,
    false,  true, false,  true, false, false, false,  true, false,  true,  // 11, 13, 17, 19,
    false, false, false,  true, false, false, false, false, false,  true   // 23, 29.
};

bool isPrime(int num)
{
    // first, modulo by 2, 3, and 5
    if ((num % 2 == 0) || (num % 3 == 0) || (num % 5 == 0))
        return false;

    int rootNum = sqrt(num);
    // and then, divide by the congruent numbers represented by wheel of 30
    for (int i = 7; i <= rootNum; i += 2)
    {
        // divide the number with the ones congruency to modulo 30 is true
        if (wheelOf30[ i % 30 ])
        {
            if (num % i == 0)
                return false;
        }
    }

    return true;
}

Your main program would look like this:

#include <iostream>

extern bool isPrime(int num);

int main(int argc, const char * argv[])
{
    int n, m;
    cin >> n >> m;

    for (int i=n; i<=m; ++i)
        if (isPrime(i))
            cout << i << endl;
}

C++ prime numbers interval

3 Answers3