Assuming you are sure that your OutOfMemoryException could be avoided, and your approach of having it all in memory isn't the actual issue (the GC is pretty good at stopping this happening if memory is available) ...
- Here is your first problem. I'm not sure the CLR supports any single object larger than 2 GB.
- Secondly you are talking about "some are kept some are GC'd". i.e. you want to be able to reallocate elements of your array once you are done with a "child array".
- Thirdly, I'm assuming that the
PointF[] giganticList = new PointF[100];
in your question is meant to be more like PointF[] giganticList = new PointF[1000000];
?
Also consider using MemoryFailPoint
as this allows you to "demand" memory and check for exceptions instead of crashing with OutOfMemoryException.
EDIT Perhaps most importantly you are now entering a land of trade-offs. If you do this you could start losing the advantages of things like the jitter optimising for
loops by doing array bound checks at the start of the loop (for (int i= 0; i < myArray.Length; i++)
gets optimised, int length = 5; for (int i= 0; i < length; i++)
doesn't). If you have high computation resource code, then this could hurt you. You are also going to have to work far harder to process different child arrays in parallel with each other as well. Creating copies of the child arrays, or sections of them, or even items inside them, is still going to allocate more memory which will be GC'd.
This is possible by wrapping the array, and tracking which sections are used for which child arrays. You are essentially talking about allocating a huge chunk of memory, and then reusing parts of it without putting the onus on the GC. You can take advantage of ArraySegment<T>
, but that comes with its own potential issues like exposing the original array to all callers.
This is not going to be simple, but it is possible. Likely as not each time you remove a child array you will want to defragment your master array by shifting other child arrays to close the gaps (or do that when you have run out of contiguous segments).
A simple example would look something like the (untested, don't blame me if your computer leaves home and your cat blows up) pseudocode below. There are two other approaches, I mention those at the end.
public class ArrayCollection {
List<int> startIndexes = new List<int>();
List<int> lengths = new List<int>();
const int 1beeellion = 100;
PointF[] giganticList = new PointF[1beeellion];
public ArraySegment<PointF> this[int childIndex] {
get {
// Care with this method, ArraySegment exposes the original array, which callers could then
// do bad things to
return new ArraySegment<String>(giganticList, startIndexes[childIndex], length[childIndex]);
}}
// returns the index of the child array
public int AddChild(int length) {
// TODO: needs to take account of lists with no entries yet
int startIndex = startIndexes.Last() + lengths.Last();
// TODO: check that startIndex + length is not more than giganticIndex
// If it is then
// find the smallest unused block which is larger than the length requested
// or defrag our unused array sections
// otherwise throw out of memory
startIndexes.Add(startIndex); // will need inserts for defrag operations
lengths.Add(length); // will need inserts for defrag operations
return startIndexes.Count - 1; // inserts will need to return inserted index
}
public ArraySegment<PointF> GetChildAsSegment(int childIndex) {
// Care with this method, ArraySegment exposes the original array, which callers could then
// do bad things to
return new ArraySegment<String>(giganticList, startIndexes[childIndex], length[childIndex]);
}
public void SetChildValue(int childIndex, int elementIndex, PointF value) {
// TODO: needs to take account of lists with no entries yet, or invalid childIndex
// TODO: check and PREVENT buffer overflow (see warning) here and in other methods
// e.g.
if (elementIndex >= lengths[childIndex]) throw new YouAreAnEvilCallerException();
int falseZeroIndex = startIndexes[childIndex];
giganticList[falseZeroIndex + elementIndex];
}
public PointF GetChildValue(int childIndex, int elementIndex) {
// TODO: needs to take account of lists with no entries yet, bad child index, element index
int falseZeroIndex = startIndexes[childIndex];
return giganticList[falseZeroIndex + elementIndex];
}
public void RemoveChildArray(int childIndex) {
startIndexes.RemoveAt(childIndex);
lengths.RemoveAt(childIndex);
// TODO: possibly record the unused segment in another pair of start, length lists
// to allow for defraging in AddChildArray
}
}
Warning The above code effectively introduces buffer overflow vulnerabilities if, for instance, you don't check the requested childIndex
against length
for the child array in methods like SetChildValue
. You must understand this and prevent it before trying to do this in production, especially if combining these approaches with use of unsafe
.
Now, this could be extended to return psuedo index public PointF this[int index]
methods for child arrays, enumerators for the child arrays etc., but as I say, this is getting complex and you need to decide if it really will solve your problem. Most of your time will be spent on the reuse (first) defrag (second) expand (third) throw OutOfMemory
(last) logic.
This approach also has the advantage that you could allocate many 2GB subarrays and use them as a single array, if my comment about the 2GB object limit is correct.
This assumes you don't want to go down the unsafe
route and use pointers, but the effect is the same, you would just create a wrapper class to manage child arrays in a fixed block of memory.
Another approach is to use the hashset/dictionary approach. Allocate your entire (massive 2GB array) and break it into chunks (say 100 array elements). A child array will then have multiple chunks allocated to it, and some wasted space in its final chunk. This will have the impact of some wasted space overall (depending on your average "child length vs. chunk length" predictions), but the advantage that you could increase and decrease the size of child arrays, and remove and insert child arrays with less impact on your fragmentation.
Noteworthy References:
Other examples of accessing arrays as a different kind of array or structure. The implementations of these might help you develop your own solution
Array optimisation
Parallel arrays and use of unsafe