Our multithreaded application runs a lengthy computational loop. On average, it takes about 29 sec to finish one full cycle. During that time, the .NET performance counter "% Time in GC" reads 8.5%. All of it comes from Gen 2 collections.
In order to improve performance, we implemented a pool for our large objects and achieved a 100% reuse rate. The overall cycle now takes only 20 sec on average. "% Time in GC" now shows between 0.3 and 0.5%, and the GC does only Gen 0 collections.
Let's assume the pooling is implemented efficiently and neglect the additional time it takes to execute. Then we got a performance improvement of roughly 31 percent (9 of 29 sec). How does that relate to the former GC value of 8.5%?
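Written out, the arithmetic behind that question looks roughly like this. It is only a back-of-the-envelope estimate: it assumes the sampled "% Time in GC" values are representative averages over a whole cycle, and it uses 0.4% as the midpoint of the 0.3 to 0.5% range.

$$
\begin{aligned}
\Delta t &= 29\,\mathrm{s} - 20\,\mathrm{s} = 9\,\mathrm{s}, \qquad \frac{9}{29} \approx 31\,\% \\
t_{\mathrm{GC,before}} &\approx 0.085 \times 29\,\mathrm{s} \approx 2.5\,\mathrm{s} \\
t_{\mathrm{GC,after}} &\approx 0.004 \times 20\,\mathrm{s} \approx 0.1\,\mathrm{s} \\
\Delta t - \left(t_{\mathrm{GC,before}} - t_{\mathrm{GC,after}}\right) &\approx 9\,\mathrm{s} - 2.4\,\mathrm{s} = 6.6\,\mathrm{s} \approx 23\,\%\ \text{of}\ 29\,\mathrm{s}
\end{aligned}
$$

So the reported GC time accounts for only about 2.4 of the 9 seconds saved; the remaining 6.6 seconds or so are what needs explaining.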
I have some assumptions, which I hope can be confirmed, adjusted and amended:
1) "% Time in GC" (if I read it right) measures the ratio of two time spans:
- the time between two GC cycles, and
- the time used for the last full GC cycle; this value is included in the first span.
What would not be included in the second time span is the overhead of stopping and restarting the worker threads for the blocking GC. But how could that be as large as 20% of the overall execution time?
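Independently of the performance counter, the number of collections per generation during one cycle can be cross-checked from inside the process with `GC.CollectionCount`. The following is a minimal sketch, not the application's real code: `GcCycleStats`, `Measure` and the placeholder workload are hypothetical names used only for illustration.

```csharp
using System;
using System.Diagnostics;

static class GcCycleStats
{
    // Runs one computation cycle and reports the wall-clock time plus the
    // number of collections per generation that happened while it ran.
    // Note: a higher-generation collection also counts toward the lower
    // generations, so the Gen 0 figure includes Gen 1 and Gen 2 collections.
    public static void Measure(Action runCycle)
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);
        Stopwatch watch = Stopwatch.StartNew();

        runCycle();

        watch.Stop();
        Console.WriteLine("Elapsed: {0:F1} s", watch.Elapsed.TotalSeconds);
        Console.WriteLine("Gen 0: {0}, Gen 1: {1}, Gen 2: {2}",
            GC.CollectionCount(0) - gen0,
            GC.CollectionCount(1) - gen1,
            GC.CollectionCount(2) - gen2);
    }

    static void Main()
    {
        // Placeholder workload (an assumption): each array is larger than
        // 85,000 bytes, so it is allocated on the large object heap.
        Measure(() =>
        {
            for (int i = 0; i < 1000; i++)
            {
                byte[] buffer = new byte[1000000];
                buffer[0] = 1;
            }
        });
    }
}
```

Wrapping the real cycle in `Measure` for both the pooled and the unpooled build would show whether the Gen 2 count really drops to zero, and how many blocking collections a single cycle triggers.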
2) Frequently blocking the threads for GC may introduce contention between the threads. This is just a thought; I could not confirm it with the VS concurrency profiler.
3) In contrast, I could confirm that the number of page faults (performance counter: Memory -> Page Faults/sec) is significantly higher for the unpooled application (25,000 per second) than for the application with the low GC rate (200 per second). I could imagine this causing the great improvement as well. But what could explain that behaviour? Is it because frequent allocations cause a much larger area of the virtual address space to be used, which is therefore harder to keep in physical memory? And how could that be measured to confirm it as the reason here?
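One possible way to gather the numbers for point 3 side by side is to sample the relevant counters once per second while a cycle runs. The sketch below is an assumption about how this could be set up, not a confirmed diagnosis method. It swaps the system-wide Memory -> Page Faults/sec for the per-process Process -> Page Faults/sec, and adds Memory -> Pages Input/sec, because Page Faults/sec counts soft and hard faults together while Pages Input/sec only reflects pages actually read from disk. The class name `CounterSampler` and the process name passed as argument are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class CounterSampler
{
    static void Main(string[] args)
    {
        // Assumption: the counter instance name equals the process name.
        // With several instances of the same executable running, Windows
        // appends suffixes such as "#1", which this sketch does not handle.
        string instance = args.Length > 0
            ? args[0]
            : Process.GetCurrentProcess().ProcessName;

        using (var timeInGc = new PerformanceCounter(".NET CLR Memory", "% Time in GC", instance, true))
        using (var pageFaults = new PerformanceCounter("Process", "Page Faults/sec", instance, true))
        using (var workingSet = new PerformanceCounter("Process", "Working Set", instance, true))
        using (var pagesInput = new PerformanceCounter("Memory", "Pages Input/sec"))
        {
            // Rate counters need a first sample as a baseline.
            pageFaults.NextValue();
            pagesInput.NextValue();

            for (int i = 0; i < 30; i++)
            {
                Thread.Sleep(1000);
                Console.WriteLine(
                    "{0,6:F1} % GC  {1,8:F0} faults/s  {2,8:F0} pages in/s  {3,8:F0} MB working set",
                    timeInGc.NextValue(),
                    pageFaults.NextValue(),
                    pagesInput.NextValue(),
                    workingSet.NextValue() / (1024 * 1024));
            }
        }
    }
}
```

Run as `CounterSampler MyApp` against both builds. If Page Faults/sec is high while Pages Input/sec stays near zero, the faults are soft faults that never touch the disk, which would point to freshly committed memory rather than paging.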
BTW: GCSettings.IsServerGC = false, .NET 4.0, 64-bit, running on Win7, 4 GB, Intel i5. (And sorry for the long question.. ;)