2

I've all prime numbers that can be stored in 32bit unsigned int and I want to use them to generate some 64bit prime numbers. using trial division is too slow even with optimizations in logic and compilation.

I'm trying to modify Sieve of Eratosthenes to work with the predefined list, as follow:

  1. in array A from 2 to 4294967291
  2. in array B from 2^32 to X inc by 1
  3. find C which is first multiple of current prime.
  4. from C mark and jump by current prime till X.
  5. go to 1.

The problem is step 3 which use modulus to find the prime multiple, such operation is the reason i didn't use trail division.

Is there any better way to implement step 3 or the whole algorithm.

thank you.

FuzzyBunnySlippers
  • 3,387
  • 2
  • 18
  • 28
Muhammad
  • 1,598
  • 3
  • 19
  • 31
  • How many primes do you need? – RichardPlunkett Dec 24 '13 at 14:47
  • all primes to `0xFFFFFFFFFF`, and i want to calculate them. – Muhammad Dec 24 '13 at 14:49
  • 3 strikes me as O(1) per prime, while 4 strikes me as more like O(X) per prime, can you expand on why it is you see 3 as a problem? – RichardPlunkett Dec 24 '13 at 14:50
  • 1
    0xFFFFFFFFFF is only 40 bits. Thats an awful lot less than 64 bits. – RichardPlunkett Dec 24 '13 at 14:51
  • 1
    3 only needs constant time. And if your upper limit is 0xFFFFFFFFFF, you only need primes less than 0xFFFFF in the seive. – Buddhima Gamlath Dec 24 '13 at 14:52
  • If you only want40 bits, only sieve primes up to 2^20. Also, sieve really isnt much faster if you already know primes, the finding primes part is really cheap, its the iterating them over the sieve's range that's expensive. it might be easier to just sieve from the base. – RichardPlunkett Dec 24 '13 at 14:54
  • You only need 20 bits of primes then, not 32. How much storage ya got? – Yakk - Adam Nevraumont Dec 24 '13 at 14:54
  • 120GB of disk storage, 8GB of memory, and the `0xFFFFFFFFFF` is the limit to test the algorithm speed, that's why i said the step 3 will take more time because of division and step 4 is just to mark. that what is in my mind, I'm not good algorithm time estimation. – Muhammad Dec 24 '13 at 15:02

1 Answers1

5

Increment by 2, not 1. That's the minimal optimization you should always use - working with odds only. No need to bother with the evens.

In C++, use vector<bool> for the sieve array. It gets automatically bit-packed.

Pre-calculate your core primes with segmented sieve. Then continue to work by big enough segments that fit in your cache, without adding new primes to the core list. For each prime p maintain additional long long int value: its current multiple (starting from the prime's square, of course). The step value is twice p in value, or p offset in the odds-packed sieve array, where the i-th entry stands for the number o + 2i, o being the least odd not below the range start. No need to sort by the multiples' values, the upper bound of core primes' use rises monotonically.

sqrt(0xFFFFFFFFFF) = 1048576. PrimePi(1048576)=82025 primes is all you need in your core primes list. That's peanuts.

Integer arithmetics for long long ints should work just fine to find the modulo, and so the smallest multiple in range, when you first start (or resume your work).

See also a related answer with pseudocode, and another with C code.

Community
  • 1
  • 1
Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • Will this algorithm need modification for finding all 64bit primes? – Muhammad Dec 24 '13 at 15:20
  • just more space and time. Obviously, more core primes. There are N primes below 2^32, [N = 203,280,221](http://www.wolframalpha.com/input/?i=PrimePi%5B2%5E32%5D). That's must harder. :) – Will Ness Dec 24 '13 at 15:22
  • there is 203280221 prime below 2^32, they are my core prime list. – Muhammad Dec 24 '13 at 15:23
  • 1
    8 bytes for `long long` multiple per int (4 bytes) prime, 12*204 mil = 2.5 GB memory. You have 8. So, slowly, this should work, segment by segment (~ 100K). :) Write them out to a file as you go. – Will Ness Dec 24 '13 at 15:26
  • 1
    BTW, if you sieve up to 2^64, it is beneficial not to pre-calculate all the core primes at once, but add them as you go. this will be just a usual segmented sieve then. For memory contiguousness, to store the core primes' info, I'd use an indexed array of blocks `long long arr[10,000,000]` (x12, in bytes) or so. See [HAT](https://en.wikipedia.org/wiki/Hashed_array_tree). No need to resize the blocks, just preallocate the index array from the start, and keep track of how many of them are full. Read the article, you'll see what I mean. Cheers. – Will Ness Dec 24 '13 at 15:34
  • 2
    That is a huge amount of machine time to sieve **all** the primes to 2^64: Using the best maximally wheel factorized Sieve of Eratosthenes, that's about 8 X 10^18 composite culling operations; even at "primesieve" speed of about 2.5 CPU clocks per cull, that's about 180 core-years to complete. Even multi-processing won't help much here unless one has access to a super-computer and is assigned the use of a hundred thousand cores for a day or so. As to storing all of the results on a hard disk, even compressed using prime gaps means about **200,000 Terabytes!!** Perhaps a sub range would do? – GordonBGood Apr 01 '14 at 05:39