8

Can anyone tell me how to implement Sieve of Eratosthenes algorithm in C? I need to generate prime numbers but my algorithm is slow.

My code:

#include <stdio.h>

int prime(long int i)
{
    long int j;
    int state = 1;
    for(j=2;j<i;j++)
    {
        if((i%j)==0){state=0;break;}
    }
    return state;
}

int main()
{
    int t;
    long int m,n,i;
    scanf("%d", &t);
    while(t--) {
        scanf("%d %d", &m,&n);
        for(i=m;i<=n;i++)
        {
            if(i==1){
                //do nothing for 1
            } else{
                if(prime(i))printf("%d\n",i);
            }
        }
    }
    return 0;
}

t is the number of test cases m and n is the range between which prime are to be printed.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
jaykumarark
  • 2,359
  • 6
  • 35
  • 53
  • Yes, the straightforward Sieve is slow. If you post what you have so far, maybe someone can help you improve it. – aschepler Jan 26 '11 at 19:32
  • 6
    5 seconds of googling: http://www.dreamincode.net/code/snippet3315.htm – Marc B Jan 26 '11 at 19:33
  • What is the maximal integer you want to have as a result? – Benoit Jan 26 '11 at 19:34
  • Look at the bottom of wiki page you pointed in your post and you will find some link to implementation of Sieve of Eratosthenes alogrithm in C – DReJ Jan 26 '11 at 19:35
  • 3
    @aschepler: particularly slow when barring every even number till infinity, then barring every multiple of 3 till infinity, etc… that can be damn slow, it is still running and i don't understand why. – Benoit Jan 26 '11 at 19:35
  • your code is not a sieve at all but just a sub-optimal trial division loop with ***n^2*** complexity. limiting the testing to sqrt of the tested number will make the complexity ***n^1.5***. testing by only primes and not all numbers like you do will bring cpxty down by another log factor or so. but the sieve of Eratosthenes' complexity is ***n log log n***. – Will Ness Jul 18 '20 at 05:22

7 Answers7

12

You need to create an array of booleans as big as the maximum prime number you want to find. At the beginning it's completely initialized to true.

The ith cell of such array will be true if i is a prime number, or false if it's not.

Start iterating from i=2: it's prime, then set to false any cell with an index multiple of 2. Go to the next prime number (i=3) and do the same. Go to the next prime (it's i=5: i=4 is not prime: array[4] was set to false while processing i=2) and do the same again and again.

peoro
  • 25,562
  • 20
  • 98
  • 150
  • 1
    when you are testing the i-th number, why don't step by i? (counter += i) – BlackBear Jan 26 '11 at 19:41
  • @BlackBear: you cannot. If you are at `i=2` and go to `4` you are skipping `3`... Anyway you could find some optimizations similar to the one you proposed to quickly _move to the next prime number_, but I don't think they could improve the complexity of your algorithm (not even your one does). – peoro Jan 26 '11 at 19:45
  • @BlackBear yes, you are correct! iteratively incrementing the index ( by counter += whatever ) is ***exactly*** the ***essence*** of the sieve of Eratosthenes. that's how "set to false any cell with an index multiple of 2" ***must*** be done. if each of the indices is instead tested for divisibility to discover whether it's a multiple, it will be a ***trial division*** sieve, which has ***much worse*** time complexity. – Will Ness Jul 18 '20 at 05:09
  • so this answer is silent on the crucial point about the algorithm, and the author's suggestion in the comments would make it simply wrong. – Will Ness Jul 18 '20 at 05:15
6

In my opinion, your algorithm slow because you calculate the inessential number. try this code

int isPrime(int number){

    if(number < 2) return 0;
    if(number == 2) return 1;
    if(number % 2 == 0) return 0;
    for(int i=3; (i*i)<=number; i+=2){
        if(number % i == 0 ) return 0;
    }
    return 1;

}
Toomtarm Kung
  • 594
  • 2
  • 7
  • 11
5

Here is actually very simple code that use the Sieve of Eratosthenes algorithm. works for all positive int.

int is_prime(int n){
  int p;
  for(p = 2; p < n; p++){
    if(n % p ==0 && p != n)
      return 0;    
  }
  return 1;
}
  • Nice, I upvoted your answer. Tips for edit it: Your definition of prime is wrong and `#include ` is superfluous. – MarianD Mar 05 '17 at 02:26
  • no, this is **not** the Sieve of Eratosthenes. **this is a very inefficient implementation of trial division primality test.** – Will Ness Jul 18 '20 at 09:15
3

Marc B's link shows a nice and simple algorithm which is correct, written by NSLogan. I wrote a slight modification to it to show some size/speed tradeoffs. I thought I'd share for interest's sake.

First, the code:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <assert.h>
#include <time.h>

#define USE_BITS

#ifdef USE_BITS
#define alloc_prime char *prime = calloc(i/8+1,sizeof(*prime));
#define set_not_prime(x) prime[x/8]|= (1<<(x%8))
#define is_prime(x) (!(prime[x/8]&(1<<(x%8))))
#endif

#ifdef USE_CHAR
#define alloc_prime char *prime = calloc(i+1,sizeof(*prime));
#define set_not_prime(x) prime[x] = 1
#define is_prime(x) (prime[x] == 0)
#endif

#ifdef USE_SIZE_TYPE
#define alloc_prime size_t *prime = calloc(i+1,sizeof(*prime));
#define set_not_prime(x) prime[x] = 1
#define is_prime(x) (prime[x] == 0)
#endif

 
int main(){
    int i;
    printf("Find primes up to: ");
    scanf("%i",&i);
 
    clock_t start, stop;
    double t = 0.0;
       
    assert((start = clock())!=-1);
       
    //create prime list
    alloc_prime;
    int c1, c2, c3;
    if(!prime){
    printf("Can't allocate %zu bytes.\n",i*sizeof(*prime));
    exit(1);
    }
       
    //set 0 and 1 as not prime
    set_not_prime(0);
    set_not_prime(1);

    //find primes then eliminate their multiples (0 = prime, 1 = composite)
    for(c2 = 2;c2 <= (int)sqrt(i)+1;c2++){
      if(is_prime(c2)){
        c1=c2;
        for(c3 = 2*c1;c3 <= i+1; c3 += c1){
          set_not_prime(c3);
        }
      }
    }
       
    stop = clock();
    t = (double) (stop-start)/CLOCKS_PER_SEC;
       
    //print primes
    for(c1 = 0; c1 < i+1; c1++){
        if(is_prime(c1))printf("%i\n",c1);
    }
    printf("Run time: %f\n", t); //print time to find primes

    return 0;
}

Since this uses sqrt, to compile use: gcc prime.c -lm

I've compared three different ways of storing the boolean variables that peoro suggested. One method actually uses bits, the second takes an entire byte, and the last uses an entire machine word. A naive guess about which is fastest might be the machine word method, since each flag/boolean is dealt with more 'naturally' by your machine. The timings to calculate the primes up to 100,000,000 on my machine were as follows:

Bits: 3.8s Chars: 5.8s M-words: 10.8s

It is interesting to note that even all the ugly bit-shifting necessary to represent a boolean with only one bit is still faster overall. My conjecture is that caching and locality-of-reference seem to outweigh the extra ~3 instructions. Plus, you end up using 1/nth the memory of the n-bit-machine-word method!

Will Ness
  • 70,110
  • 9
  • 98
  • 181
Rooke
  • 2,013
  • 3
  • 22
  • 34
1

Though it's very old post, Following is my try to generate the prime number using "Sieve of Eratosthenes" algorithm.

#include <stdio.h>

#define NUM 8000        /* Prime Numbers in the Range.  'Range + 2' actually. */

int main()
{
  int a[NUM] = {0};         /* Array which is going to hold all the numbers */
  int i , j;

  /* initializing array from 2 to given number + 2, assuming the first prime number is 2 */
  for(i = 2,j=0; i < NUM+2, j<NUM; i++, j++) 
  {
    a[j] =i;
  }


  for(i = 0; i < NUM; i++ ) 
  {
    int num = a[i];

    /* If number is not 0 then only update the later array index. */
    if(num != 0) 
    {
      for (j = i+1; j < NUM; j++) 
      {
        if( (a[j]%num == 0) ) 
        {
            a[j]=0;
        }
      }
    }
  }


  for(i = 0; i < NUM; i++) 
  {
    /* Print all the non Zero *Prime numbers* */
    if(a[i] != 0) 
    {
      printf("%d \n", a[i]);
    }
  }

}

Hope this will help someone.

pankaj
  • 21
  • 2
  • [no](https://stackoverflow.com/questions/4809051/prime-number-algorithm/4833217#comment111346417_4809107), this is *not* the sieve of Eratosthenes. it is a trial division sieve. has worse complexity than the s. of E (runs much slower, the bigger the NUM the slower it gets). – Will Ness Jul 18 '20 at 09:10
1

Fist step is to recognize that it's trivial to split it into arbitrary sized blocks; and that you do NOT need an array (or bitfield) for every number. For example, if you only care about numbers from 100000 to 110000, then to mark all numbers that are divisible by 3 as "not prime" you can do "for( index = 3 * (100000+3-1)/3; index < 110000; index += 3) {". For a more flexible example:

    for( value = 2; value < sqrt( block_end_value-1); value++ ) {
        for( index = value  * (block_state_value+value -1)/value; index < block_end_value; index += value ) {
            mark_not_prime(index - block_state_value);
        }
    }

The second step is to realize that you don't need to care about every number (and the for( value = 2; value < sqrt( block_end_value-1); value++) above is inefficient). For example, if you've already marked numbers that are divisible by 2 as "not prime" then there is no reason to care if numbers are divisible by 4, 6 or 8; and if you've already marked numbers that are divisible by 3 as "not prime" then there is no reason to care if numbers are divisible by 6, 9 or 12; and ... Essentially you only want to test if a number is divisible by another prime number. This means that to find prime numbers in the range 100000 to 110000 you first want to find prime numbers in the range 0 to sqrt(110000); and if you want to find prime numbers in the range 0 to sqrt(110000) you want want to find prime numbers in the range 0 to sqrt(sqrt(110000)); and ....

The third step is to realize that it can be accelerated by copying repeating patterns. You can create a 2-bit pattern (representing "is divisible by 2") and copy those 2 bits everywhere. In the same way you can create a 6-bit pattern (representing "is divisible by 2 or 3") and copy those 6 bits everywhere. In the same way you can create a 39916800-bit pattern (representing "is divisible by 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11") and copy that 39916800-bit pattern everywhere. Of course nothing prevents you from pre-generating a pattern and storing it in your program's code.

The fourth step is to realize that multiples of 2 are too trivial to store and by not storing them you halve the memory consumption of all tables and patterns (and any stored/pre-generated pattern).

By combining the third and fourth steps above; a pre-generated pattern representing "is divisible by 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11" will cost 19958400 bits, or about 2.38 MiB. On its own the first part of this pattern will also be usable for finding prime numbers from 1 to the first prime number that is larger than 11 (which in this case would be numbers from 1 to 13).

The fifth step is to realize that if you already have a pattern you can use it to find "N = the next "not marked yet" prime number", copy the existing pattern N times, then mark multiples of N as "not prime"; and end up with a larger pattern. For example; if you have a pattern representing "is divisible by 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11", you'd skip over 12 (because its not prime according to the existing pattern); copy the pattern 13 times, then mark the numbers that are divisible by 13 as "not prime" and end up with a pattern representing "is divisible by 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13".

The sixth step is to realize that once you have a pattern large enough for the range you want, you can fill in the missing divisors without copying. For example, if you only want prime numbers in the range from 1 to 6227020800; then you can take a pattern representing "is divisible by 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13" and then mark numbers that are divisible by prime numbers in the range from 14 to 6227020800 as "not prime".

By combining all of the above; if you want to find prime numbers in the range 1000000000 to 11000000000; then you'd start by finding prime numbers in the range 1 to sqrt(11000000000); and to do that you'd copy a pre-generated pattern and extend it until you have a table that would be large enough to represent prime numbers in the range 1 to sqrt(11000000000); then copy that extended pattern and fill in the missing numbers to generate a table representing prime numbers in the range 1 to sqrt(11000000000); then create a table for prime numbers in the range 1000000000 to 11000000000 and initialise the memory by copying data from the "extended pre-generated pattern" into it, then use the table of prime numbers in the range 1 to sqrt(11000000000) to find prime numbers that haven't already been incorporated into the "extended pre-generated pattern" to find numbers that still need to be marked as "not prime" that you need to generate the table for numbers from 1000000000 to 11000000000.

Brendan
  • 35,656
  • 2
  • 39
  • 66
0

Toomtarm Kung' answer is great, thank's a lot.

There's however still one caveat I stumbled upon: (i*i) may overflow for i>46340 on 32 bit and i>3037000499 on 64-bits, producing incorrect results, i.e. 2147483647 will not be recognised as a prime when using 32-bit integers.

To avoid the integer overflow, one can replace (i * i)<=number by i <= number / i

Now, to avoid a double division, the code may be written as follows:

int isPrime(int number){
    if(number < 2) return 0;
    if(number == 2) return 1;
    if(number % 2 == 0) return 0;
    int j;
    for(int i=3; ((j=number/i) >= i); i+=2){
        if(number - (j * i) == 0 ) return 0;
    }
    return 1;
}
joheirba
  • 1
  • 1