24

Reading the old but classic document Writing High-Performance Managed Applications - A Primer, I came across the following statement:

The GC is self-tuning and will adjust itself according to applications memory requirements. In most cases programmatically invoking a GC will hinder that tuning. "Helping" the GC by calling GC.Collect will more than likely not improve your applications performance

I am working with an application that, at a given point in time, consumes a lot of memory. When the code that consumes that memory is done, I call GC.Collect. If I don't, I get an out-of-memory exception. This behaviour is inconsistent, but roughly speaking I get the exception 30% of the time. After adding GC.Collect I never got this out-of-memory exception again. Is my action justified even though this best-practice document advises against it?

palm snow

4 Answers

30

Part of what goes on in the GC is that objects in memory are generational, such that early generations are collected more frequently than others. This helps save performance by not trying to collect long-lived objects all the time.

With that in mind, two things can happen when you call GC.Collect() yourself. The first is that you end up spending more time doing collections. This is because the normal background collections will still happen in addition to your manual GC.Collect(). The second is that you'll hang on to the memory longer, because you forced some things into a higher order generation that didn't need to go there. In other words, using GC.Collect() yourself is almost always a bad idea.

There are a few cases where the garbage collector doesn't always perform well. One of these is the large object heap (LOH). This is a special generation for objects at or above a certain size (85,000 bytes by default). This generation is hardly ever collected (only as part of a gen-2 collection) and almost never compacted. That means that over time you can end up with many sizable holes in memory that will not be released. The physical memory is not actually used and is available for other processes, but it does still consume address space within your process, of which a 32-bit process is limited to 2 GB by default.
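To check whether a given allocation lands on the large object heap, a minimal sketch along these lines may help. The class and method names are hypothetical, and it assumes the default threshold of 85,000 bytes and that the runtime reports LOH objects as belonging to the highest generation.

```csharp
using System;

public static class LohProbe
{
    // Allocates a byte array of the requested length and reports the
    // generation the GC assigns to it immediately after allocation.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        byte[] buffer = new byte[lengthInBytes];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array reachable until after the query
        return generation;
    }
}
```

Calling `LohProbe.GenerationOfNewArray(100000)` should report `GC.MaxGeneration`, while a small array normally starts out in a young generation.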

This is a very common source of OutOfMemory exceptions: you're not actually using that much memory, but you have all this address space taken up by holes in the large object heap. By far the most common way this happens is repeatedly appending to large strings or documents. This probably is not you, because in this scenario no number of calls to GC.Collect() will compact the LOH, but in your case it does seem to help. However, this is the source of the vast majority of the OutOfMemory exceptions I've seen.

Another place where the garbage collector does not always perform well is when certain things cause objects to remain rooted. One example is that event handlers can prevent an object from being collected. A way around this is to make sure that every += operation to subscribe to an event has a corresponding -= operation to unsubscribe from it. But again, a GC.Collect() is unlikely to help here: the object is still rooted somewhere, and so can't be collected.
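As a rough illustration of pairing every += with a -=, here is a minimal sketch. `Publisher` and `Subscriber` are hypothetical types invented for this example; tying the unsubscribe to `Dispose` is one possible convention, not the only one.

```csharp
using System;

public sealed class Publisher
{
    public event EventHandler Changed;

    // Number of handlers currently attached; each one roots its target object.
    public int SubscriberCount =>
        Changed == null ? 0 : Changed.GetInvocationList().Length;

    public void Raise()
    {
        Changed?.Invoke(this, EventArgs.Empty);
    }
}

public sealed class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public int Notifications { get; private set; }

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.Changed += OnChanged;   // subscribe
    }

    private void OnChanged(object sender, EventArgs e)
    {
        Notifications++;
    }

    public void Dispose()
    {
        _publisher.Changed -= OnChanged;   // matching unsubscribe
    }
}
```

Until `Dispose` runs, the publisher's delegate list holds a reference to the subscriber, so a long-lived publisher keeps every undisposed subscriber alive.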

Hopefully this gives you an avenue of investigation to solve the underlying problem that causes the need to use GC.Collect() in the first place. But if not, it is, of course, better to have a working program than a failing program. Anywhere you do use GC.Collect(), I would make sure the code is well documented with the reason why you need it (you get exceptions without it) and the exact steps and data required to reproduce the problem reliably, so that future programmers who may want to remove the call can know for sure when it is safe to do so.
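One way such a documented call might look is sketched below. The class name, method name, and the wording of the comment are all hypothetical placeholders; the reproduction details have to come from your own application.

```csharp
using System;

public static class CacheMaintenance
{
    // WHY THIS EXISTS: without this forced collection the process threw
    // OutOfMemoryException in roughly 30% of runs after the memory-heavy
    // phase finished.
    // HOW TO REPRODUCE: record the exact steps and input data here.
    // WHEN TO REMOVE: once the reproduction above passes without this call.
    public static void CollectAfterHeavyPhase()
    {
        GC.Collect(); // forces a blocking collection of all generations
    }
}
```

Keeping the call in one named method also means there is a single place to delete once the underlying cause is fixed.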

Joel Coehoorn
  • 3
    Actually, calling GC.Collect() may not be too bad (as I initially thought), especially after reading this article from Jeffrey Richter http://msdn.microsoft.com/en-us/magazine/bb985011.aspx "since your application knows more about its behavior than the runtime does, you could help matters by explicitly forcing some collections". Although I am still curious why GC.Collect helps solve OOM when the objects are on the LOH, where GC.Collect has no control. – palm snow Mar 09 '11 at 17:44
  • @palmsnow, I was also curious about it. After searching a lot I have found the answer here: http://stackoverflow.com/questions/10016541/garbage-collection-not-happening-even-when-needed – SiberianGuy Feb 08 '13 at 11:22
  • 1
    @palmsnow That's a half-decent argument in a single-threaded application, but it breaks down completely if you do any significant asynchronous or multi-threaded code. I don't even remember the last time I was working on a truly single-threaded application. Remember, `GC.Collect` affects (and freezes) the whole process. – Luaan Sep 22 '15 at 12:35
7

Most people would say that making your code work correctly is more important than making it fast. Thus, if it fails to work 30% of the time when you don't call GC.Collect(), then that trumps all other concerns.

Of course, that leads to the deeper question of "why do you get OOM errors?" Is there a deeper issue that should be fixed, instead of just calling GC.Collect()?

But the advice you found talks about performance. Do you care about performance if it makes your app fail 30% of the time?

jalf
  • @jalf, This application basically compares one image against a library of images cached in memory. Depending on the size of the images and the cache size (configured by application users), we may run into a situation causing OOM. I obviously needed application availability more than performance at that point (that's why I added GC.Collect()); however, I am still trying to determine the fine line between scalability and performance vs availability. – palm snow Mar 09 '11 at 16:36
  • @palm - I'll bet your images are sitting on the Large Object Heap. You load and unload a bunch of images and over time the LOH will fragment. If you can do the comparison in chunks using segments smaller than 85,000 bytes, the LOH will never be involved and your problem will go away. Depending on your comparison, this will likely also be **much faster**, as it may mean you don't need to evaluate the entire image for every image in the library each time. – Joel Coehoorn Mar 09 '11 at 16:58
  • @Joel, You are right; each image is at least 1MB in size and we can have well over 50,000 images in the cache. – palm snow Mar 09 '11 at 17:07
  • 1
    @palm snow: Are you performing some sort of fuzzy comparison of the images or an exact pixel-by-pixel match? If you're doing exact matches then you should consider generating/caching/matching hashes rather than the bitmaps themselves - it'll be a lot faster and a lot less memory-hungry. – LukeH Mar 09 '11 at 17:13
  • @LukeH: Missed a minor detail :) We are using a third-party library that does the comparison. Not sure about their algorithm. But given that these images sit on the LOH, a call to GC.Collect() will not help address this fragmentation issue. Any thoughts on why adding a call to GC.Collect() can make the application stable in this scenario? – palm snow Mar 09 '11 at 17:26
  • @palm snow: The GC still *collects* objects in the LOH (whenever there's a gen-2 collection, iirc). The GC doesn't *compact* the LOH; it does mark freed segments so that they can be re-used though. – LukeH Mar 09 '11 at 18:10
  • @LukeH The GC collecting objects from the LOH is not going to compact it, and therefore the size still remains the same. So I am still puzzled as to why GC.Collect helps deal with OOM in this case. – palm snow Mar 10 '11 at 18:27
2

Generally speaking, GC.Collect shouldn't be necessary. If your images exist in unmanaged memory, then be sure to use GC.AddMemoryPressure and GC.RemoveMemoryPressure appropriately.
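A minimal sketch of what using these two calls "appropriately" could look like, assuming the image data lives in a block obtained from `Marshal.AllocHGlobal`. The `UnmanagedBuffer` wrapper is a hypothetical type for illustration; a real image library will have its own allocation scheme.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class UnmanagedBuffer : IDisposable
{
    private IntPtr _pointer;
    private readonly int _size;

    public UnmanagedBuffer(int sizeInBytes)
    {
        if (sizeInBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        _pointer = Marshal.AllocHGlobal(sizeInBytes);
        _size = sizeInBytes;

        // Tell the GC that this small managed object holds a lot of native memory.
        GC.AddMemoryPressure(_size);
    }

    public bool IsAllocated => _pointer != IntPtr.Zero;

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose is never called.
    ~UnmanagedBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_pointer == IntPtr.Zero) return;

        Marshal.FreeHGlobal(_pointer);
        _pointer = IntPtr.Zero;

        // Every AddMemoryPressure needs a matching RemoveMemoryPressure of the same size.
        GC.RemoveMemoryPressure(_size);
    }
}
```

The point of the pairing is that the GC schedules collections based on what it can see; the pressure calls make the native allocation visible to that heuristic.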

Stephen Cleary
0

From your description it sounds like disposable objects are not being disposed, or you're not setting member values that are about to be replaced to null before the operation. As an example of the latter:

  • Get table, display in grid
  • (User hits refresh)
  • Form is disabled while data is refreshed
  • Query comes back, new data is populated in the grid

You can clear out the grid in the interim since it's about to be replaced anyway; you will temporarily have both tables in memory (unnecessarily) while it is replaced if you don't.
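The refresh sequence above could be sketched roughly as follows. `GridModel` and its `int[]` payload are hypothetical stand-ins for the grid and its table.

```csharp
using System;

public sealed class GridModel
{
    private int[] _rows; // stand-in for the table shown in the grid

    public bool HasData => _rows != null;

    public int RowCount => _rows == null ? 0 : _rows.Length;

    public void Refresh(Func<int[]> runQuery)
    {
        // Drop the old table first, so it is eligible for collection
        // while the (possibly large) replacement is being built.
        _rows = null;

        _rows = runQuery();
    }
}
```

Without the `_rows = null;` line, the old table stays reachable until the assignment completes, so both tables are alive at the same time during the query.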

Mark Sowul