We have run into similar problems to @Grzenio however we are working with much larger 2-dimensional arrays, in the order of 1000x1000 to 3000x3000, this is in a webservice.
Adding more memory isn't always the right answer, you have to understand your code and the use case. Without GC collecting we require 16-32gb of memory (depending on customer size). Without it we would require 32-64gb of memory and even then there are no guarantees the system won't suffer. The .NET garbage collector is not perfect.
Our webservice has an in-memory cache in the order of 5-50 million string (~80-140 characters per key/value pair depending on configuration), in addition with each client request we would construct 2 matrices one of double, one of boolean which were then passed to another service to do the work. For a 1000x1000 "matrix" (2-dimensional array) this is ~25mb, per request. The boolean would say which elements we need (based on our cache). Each cache entry represents one "cell" in the "matrix".
The cache performance dramatically degrades when the server has > 80% memory utilization due to paging.
What we found is that unless we explicitly GC the .net garbage collector would never 'cleanup' the transitory variables until we were in the 90-95% range by which point the cache performance had drastically degraded.
Since the down-stream process often took a long duration (3-900 seconds) the performance hit of a GC collection was neglible (3-10 seconds per collect). We initiated this collect after we had already returned the response to the client.
Ultimately we made the GC parameters configurable, also with .net 4.6 there are further options. Here is the .net 4.5 code we used.
if (sinceLastGC.Minutes > Service.g_GCMinutes)
{
Service.g_LastGCTime = DateTime.Now;
var sw = Stopwatch.StartNew();
long memBefore = System.GC.GetTotalMemory(false);
context.Response.Flush();
context.ApplicationInstance.CompleteRequest();
System.GC.Collect( Service.g_GCGeneration, Service.g_GCForced ? System.GCCollectionMode.Forced : System.GCCollectionMode.Optimized);
System.GC.WaitForPendingFinalizers();
long memAfter = System.GC.GetTotalMemory(true);
var elapsed = sw.ElapsedMilliseconds;
Log.Info(string.Format("GC starts with {0} bytes, ends with {1} bytes, GC time {2} (ms)", memBefore, memAfter, elapsed));
}
After rewriting for use with .net 4.6 we split the garbage colleciton into 2 steps - a simple collect and a compacting collect.
public static RunGC(GCParameters param = null)
{
lock (GCLock)
{
var theParams = param ?? GCParams;
var sw = Stopwatch.StartNew();
var timestamp = DateTime.Now;
long memBefore = GC.GetTotalMemory(false);
GC.Collect(theParams.Generation, theParams.Mode, theParams.Blocking, theParams.Compacting);
GC.WaitForPendingFinalizers();
//GC.Collect(); // may need to collect dead objects created by the finalizers
var elapsed = sw.ElapsedMilliseconds;
long memAfter = GC.GetTotalMemory(true);
Log.Info($"GC starts with {memBefore} bytes, ends with {memAfter} bytes, GC time {elapsed} (ms)");
}
}
// https://msdn.microsoft.com/en-us/library/system.runtime.gcsettings.largeobjectheapcompactionmode.aspx
public static RunCompactingGC()
{
lock (CompactingGCLock)
{
var sw = Stopwatch.StartNew();
var timestamp = DateTime.Now;
long memBefore = GC.GetTotalMemory(false);
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
var elapsed = sw.ElapsedMilliseconds;
long memAfter = GC.GetTotalMemory(true);
Log.Info($"Compacting GC starts with {memBefore} bytes, ends with {memAfter} bytes, GC time {elapsed} (ms)");
}
}
Hope this helps someone else as we spent a lot of time researching this.
[Edit] Following up on this, we have found some additional problems with the large matrices. we have started encountering heavy memory pressure and the application suddenly being unable to allocate the arrays, even if the process/server has plenty of memory (24gb free). Upon deeper investigation we discovered that the process had standby memory that was almost 100% of the "in use memory" (24gb in use, 24gb standby, 1gb free). When the "free" memory hit 0 the application would pause for 10+ seconds while standby was reallocated as free and then it could start responding to requests.
Based on our research this appears to be due to fragmentation of the large object heap.
To address this concern we are taking 2 approaches:
- We are going to change to jagged array vs multi-dimensional arrays. This will reduce the amount of continuous memory required, and ideally keep more of these arrays out of the Large Object Heap.
- We are going to implement the arrays using the ArrayPool class.
[Edit2] Implementing the ArrayPool class required some rework of the existing code because the Array that is "rented" is not the same size as the requested e.g. if you request an array of size 12, you could get a size 20 array. So things like the "Length" property can be unpredictable. To address this we actually wrapped the ArrayPool in another class which guaranteed that you could only access the cells in the array of the length requested and was consistent in its returning of count/length properties. Doing this avoided significant redesign of the application code.
[Edit3] We later determined that the largest hit in our RunGC method is the WaitForPendingFinalizers, if you look at the fine code for this it is actually doing a blocking collect so even if you don't "force" the collect in the first GC.Collect() call, this will cause a blocking collect. Furthermore the finalizer process is single threaded so we discovered that we needed to remove this statement entirely. After doing this we found our blocking and collection time dramatically reduced. This solution may not work for everyone, especially if using COM objects, but it worked for us. If using COM objects, there are alternatives but it requires very advanced coding practices and all COM objects have to be manually released and you have to be very careful when accessing properties of COM objects to avoid "hidden" instances of COM objects.