First, a little background: I enjoy working on project Euler problems (https://projecteuler.net/archives) but many of them require a ton of heavy computation so I try to save known constants in memory so they don't have to be recalculated every time. These include things like n!, nPr, nCr, and lists of primes. For the purpose of this question let's just stick with primes because any solution with those can be easily ported to the others.
The question: Let's say I want to save the first 1,000,000 primes in memory for repeated access while doing heavy computation. The 1,000,000th prime is 15,485,863 so ints will do just fine here. I need to save these values in a way such that access is O(1) because these will be access a lot.
What I've tried so far:
Clearly I can't put all 1,000,000 in one cs file because Visual Studio throws a fit. I've been trying to break it into multiple files using a partial class and 2-D List<List<int>>
public partial class Primes
{
public readonly List<int> _primes_1 = new List<int>
{
2, 3, ... 999983
}
}
So _primes_1
has the primes less than 1,000,000, _primes_2
has the primes between 1,000,000 to 2,000,000, etc, 15 files worth. Then I put them together
public partial class Primes
{
public List<List<int>> _primes = new List<List<int>>()
{
_primes_1, _primes_2, _primes_3, _primes_4, _primes_5,
_primes_6, _primes_7, _primes_8, _primes_9, _primes_10,
_primes_11, _primes_12, _primes_13, _primes_14, _primes_15
};
}
This methodology does work as it is easy to enumerate through the list and IsPrime(n)
checks are fairly simple as well (binary search). The big downfall with this methodology is that VS starts to freak out because each file has ~75,000 ints in it (~8000 lines depending on spacing). In fact, much of my editing of these files has to be done in NPP just to keep VS from hanging/crashing.
Other things I've considered: I originally read the numbers in off a text file and could do that in the program but clearly I would want to do that at startup and then just have the values available. I also considered dumping them into sql but again, eventually they need to be in memory. For the in memory storage I considered memcache but I don't know enough about it to know how efficient it is in look ups.
In the end, this comes down to two questions:
How do the numbers get in to memory to begin with?
What mechanism is used to store them?
Spending a little more time in spin up is fine (within reason) as long as the lookup mechanism is fast fast fast.
Quick note: Yes I know that if I only do 15 pages as shown then I won't have all 1,000,000 because 15,485,863 is on page 16. That's fine, for our purposes here this is good enough.