-1

I am working on a program which prints the number of pairs of prime numbers (within a range L and R) whose difference is 6, where 2 <= L < R <= 10^9.

eg.,

L=2 R=20
Ans:  4
Explanation: 4 pairs are there: {5, 11}, {7, 13}, {11, 17}, {13, 19}

So my approach is to first store all the prime numbers within the range and then find all the pairs.

In worst case, L=2 and R=10^9; To get all the prime numbers I am using sieve of eratosthenes, but its time complexity is O(nloglogn) which is a problem and another problem is of space (space complexity is O(n)) [array of size 10^9 can not be there]. I am running this program over online IDE.

#include <bits/stdc++.h>
#define siz 1000000000
using namespace std;

int main()
{
    vector<bool> prime(siz);
    int i, j;
    for(i=0;i<siz;i++)
    {
        prime[i]=false;
    }
    for(i=2;i*i<=siz;i++)
    {
        if(prime[i]==false)
        {
        for(j=i*i;j<=siz;j+=i)
        {
            prime[j]=true;
        }
        }
    }
    int ans=0;
    for(i=0;i<siz;i++)
    {
        if(prime[i]==false)
        {
            ans++;
        }
    }
    cout<<ans;
    return 0;
}

Above program is a part of the whole program where I am just counting the number of prime numbers till 10^9 and the message by compiler is SIGTSTP and when I am reducing the siz to 10^8, it's running fine and the output is 5761455 which is correct (I have verified it from wikipedia).

joanis
  • 10,635
  • 14
  • 30
  • 40
  • 1
    1. O(n Log n) is not too far away from O(n). See my cute (if I may say so myself) answer to https://stackoverflow.com/questions/56952527/why-is-sorting-not-taking-on-log-n-in-time/56952664#56952664 – Bathsheba Aug 27 '19 at 12:12
  • 2
    2. You don't need to store all the "no" cases. So no need for an array - just a list of yes cases. Methinks though the whole lot could fit in a `std::vector` unless you're coding on a ZX81. – Bathsheba Aug 27 '19 at 12:13
  • 5
    _array of size 10^9 cannot be there_ Why not? Stack too small? You could make this array `static` or use `std::vector` to put it on heap. Additionally, you could consider to store booleans as bits. (The latter would reduce the needed memory to 1/8.) Btw. `std::vector` is a specialization which has this feature built-in. (Mostly, it's considered as annoying but in your case it would pay off.) ;-) – Scheff's Cat Aug 27 '19 at 12:13
  • @Bathsheba I am using int which is capable of storing 1e9 – Sannidhya Raj Chauhan Aug 27 '19 at 12:24
  • The loglog function grows extremely slowly - it is almost indistinguishable from a constant. I suspect some other problem in your sieve implementation. – molbdnilo Aug 27 '19 at 12:31
  • @Bathsheba If I have to store all the yes cases, then also it would require to iterate over all the 1e9 numbers which would give timeout [would take almost 10 seconds] and I am using **bool prime[10^9]** which is giving runtime error and when size of array is reduced it is working fine. – Sannidhya Raj Chauhan Aug 27 '19 at 12:33
  • @Scheff I am running this program on online IDE. I am using *std: :vector* yet its giving runtime error for larger range but is working fine for smaller ones. – Sannidhya Raj Chauhan Aug 27 '19 at 12:38
  • I expected such reply somehow. ;-) However, I think this is an essential limitation worth to be [edit]ed into the question. Is it required that code must run with R = 1e9 in online compiler? – Scheff's Cat Aug 27 '19 at 12:39
  • @molbdnilo constant is indistinguishable but yet in O(n) too it would be very slow. Since all the online judges allow 10^8 iterations per second it would take 10 sec. – Sannidhya Raj Chauhan Aug 27 '19 at 12:43
  • At least, coliru seems to allow a [std::vector(1e9)](http://coliru.stacked-crooked.com/a/3aeb34d1de846417) but, of course, TLE is another topic. – Scheff's Cat Aug 27 '19 at 12:47
  • `bool prime[10^9]` will take *much* more space than `std::vector`: the latter has a bit-by-bit specialisation. The former is probably as big as an `int prime[]`. – Bathsheba Aug 27 '19 at 12:49
  • @SannidhyaRajChauhan *I am using std: :vector yet its giving runtime error for larger range* -- Which could mean you are accessing a value out-of-bounds, thus your program was always broken, even for smaller values. Use `at()` when accessing the elements of the vector instead of `[ ]` – PaulMcKenzie Aug 27 '19 at 12:49
  • @Scheff yes it is required. Please specify if there is any method more efficient than sieve of eratosthenes. – Sannidhya Raj Chauhan Aug 27 '19 at 12:51
  • @Scheff: Sadly not true, sizeof(bool) is implementation defined in C++. – Bathsheba Aug 27 '19 at 12:51
  • 1
    @Bathsheba Damn. – Scheff's Cat Aug 27 '19 at 12:52
  • @Scheff: Whoever came up with `bool` needs a stern talking to. – Bathsheba Aug 27 '19 at 12:52
  • I just realized that there is no need to store even numbers. The only even prime is 2. So, you can reduce memory to half additionally. – Scheff's Cat Aug 27 '19 at 12:53
  • If you want to out-smart the online judge (and there are no constraints on compile time), then you could use the Erwin Unruh metaprogramming technique to generate the primes, then the runtime is reduced to spotting the pairs. – Bathsheba Aug 27 '19 at 12:56
  • I have no idea how to beat Sieve of Eratosthenes but I did [google "better than sieve of eratosthenes"](https://www.google.com/search?q=better+than+sieve+of+eratosthenes). The other question is whether the results of alternative algorithms can be easily searched for pairs like the outcome of the former. – Scheff's Cat Aug 27 '19 at 12:59
  • @Scheff: I think the Atkin algorithm (similar to Eratosthenes) has a faster implementation, but I suspect the Eratosthenes approach is sufficient and it's the implementation that's defective in this case. – Bathsheba Aug 27 '19 at 13:00
  • 1
    Use a segmented Sieve of Eratosthenes. See [here](http://stackoverflow.com/a/10249801). – user448810 Aug 27 '19 at 13:13
  • @Bathsheba please consider my code I have added it. And please suggest any modification. – Sannidhya Raj Chauhan Aug 27 '19 at 13:40
  • @Scheff If we have to store only odd numbers then how will we implement sieve because in sieve, index number is equal to the value of number which helps in direct access and if we are storing only odd numbers then it will take more time in searching . Correct me if I am wrong, please. – Sannidhya Raj Chauhan Aug 27 '19 at 13:47
  • Seriously? When iterating, just skip indices with even value, and divide indices with odd value by 2 to get effective index into array (or vector). Of course, be carefully concerning off-by-one-error... ;-) – Scheff's Cat Aug 27 '19 at 13:59
  • I tried to implement the Sieve of Eratosthenes (I never did before): [**Live Demo on coliru**](http://coliru.stacked-crooked.com/a/abf46d7dccd88417). I believe I did it correctly - the first primes looked OK, the higher I don't remember but you might check with a table if in doubt. – Scheff's Cat Aug 27 '19 at 14:19
  • I disagree that this question is too broad. I think it is clear exactly what is being asked and it is very specific. I would agree there might not be an answer. – Marlin Pierce Aug 27 '19 at 14:25
  • An improved version (with `if` removed from inner loop): [**Live Demo on coliru**](http://coliru.stacked-crooked.com/a/314cc08282f14d29) – Scheff's Cat Aug 27 '19 at 14:29
  • The best case I can imagine, is doing the sieve of eratosthenes up to square root of R, which could be as high as 31622 to find all the primes you need to consider as factors. Then do the sieve of eratosthenes from L to R. You can do bit masking to treat bytes as a bit array, and you can omit storing even numbers as mentioned above, so a bit array up to square root R could be as much as 1976 bytes. The processing time as R is close to 10^9 will still be high. – Marlin Pierce Aug 27 '19 at 14:36
  • @Scheff thanks for the efforts you put in writing the code, but the earlier program and the later one too is running properly till 10^8 and for 10^9 again the same message as **SIGTSTP**. – Sannidhya Raj Chauhan Aug 27 '19 at 14:37
  • @MarlinPierce thanks for the support as this is my first question I did not know much about it. As I added the code a message saying *too broad* was displayed so I deleted it. – Sannidhya Raj Chauhan Aug 27 '19 at 14:43
  • @SannidhyaRajChauhan The question itself is too broad for SO moderation rules, but it's very helpful to have the code. I suggest you revert that edit and put the code back in. I can assure you it was not marked too broad because of you adding the code; it was probably just a coincidence that the third vote to close it came just after you added the code. The question is much better with the code. – joanis Aug 27 '19 at 15:04
  • Note that the elements of the vector are initialised to false, no need for you to do that. And I've cast my reopen vote. – Bathsheba Aug 27 '19 at 16:21
  • my simple non-segmented SoE on odds gets to 10^9 in under 4 secs [on ideone](https://ideone.com/fapob). you can tweak it to count the pairs instead. if it's still too slow, do make it [segmented](https://stackoverflow.com/a/10249801/849891). – Will Ness Aug 27 '19 at 17:20
  • @Bathsheba no, according to the [accepted wisdom](https://stackoverflow.com/search?q=user%3A549617+atkin) sieve of Atkin is never faster than properly wheeled (i.e. mod-210 at least) sieve of Eratosthenes (esp. when *segmented*). – Will Ness Aug 27 '19 at 17:35

2 Answers2

1

You are trying to find the sexy primes (Wikipedia, MathWorld, OEIS) less than a billion; mathematicians call them sexy because they differ by six, which is sex in Latin. There are 6,849,047 such primes, according to OEIS.

There is no need to store all of the primes less than a billion; you only ever need three of them. You can enumerate the primes by a segmented sieve or by a generator. Both of those methods are given in a recent discussion here on Stack Overflow. That code is in Python, so you will have to port it to C++.

I used the code in Pari that is also mentioned at that discussion to calculate the sexy primes on my cell phone, finding them in 62 seconds. Who knew a cell phone had so much power!

user448810
  • 17,381
  • 4
  • 34
  • 59
0

Kim Walisch has a fast C++ prime generator here which is pretty straight forward. It shouldn't be hard to adapt it to solve the difference of 6 problem. Kim has another version which is faster and far more complex to handle really big sized prime problems.

Greg Ames
  • 146
  • 4