I wasn't sure where to go for help with this, so I figured I'd try Stack Overflow, since it usually has answers for about 90% of my programming-related questions.

In a nutshell, I've got an open source .NET application that's leaking memory. It might not be a true memory leak, in the sense that I suspect the memory is reclaimed when the application closes, but while it's running it constantly allocates more memory without freeing it. Eventually, a System.OutOfMemoryException is thrown.

To debug the problem I followed the steps recommended in this article and used the Windows Performance Monitor tool to produce the following graph, where red is the .NET CLR Memory "# Bytes in all Heaps" counter and green is the Process "Private Bytes" counter. (Note that the green line has been uniformly scaled down to appear closer to the red line, since only the shapes of the lines matter to me.)

[Image: Performance Monitor output]

I took the image as evidence of a managed memory leak and followed up with the Windows Debug Diagnostic Tool to try to locate the source of the leak (as mentioned in the article). However, the report I got back from the Debug Diagnostic Tool is very peculiar.

I tried to collect a "Full UserDump" every 5 seconds while the application was running. Basically every attempt was thwarted because the garbage collector was in the middle of a garbage collection cycle at the time, which caused the Debug Diagnostic Tool to output an error and prevented it from gathering any useful .NET memory information.

Now I'm stuck. I know I have a managed memory leak, but I don't know how to narrow down where it is. I'm also confused about how the garbage collector could always be in the middle of a collection cycle. It makes me wonder whether the garbage collection thread was somehow blocked, preventing it from freeing memory and/or exiting the garbage collection cycle.

There are some sections of the Performance Monitor graph where the allocated .NET memory goes down a bit, so the garbage collector wasn't stuck forever. It must have been stuck most of the time, though, otherwise the Debug Diagnostic Tool should have been able to take a UserDump.

A couple of questions:

  1. Is it possible for the garbage collector in a .NET application to get stuck while trying to free memory, perhaps because of a poorly coded destructor/finalizer?

  2. What strategy can I use to continue narrowing down the source of the problem?

Ghost314

1 Answer

It would help if you provided the .NET Framework version, the OS version, the bitness of your process, and the number of processors on your machine. If you are hitting the out-of-memory exception very quickly, that makes me think you are running a 32-bit process, but it would still help if you could confirm.

First of all, you should check whether the GC is really the cause of the problem by looking at the "% Time in GC" counter for your process. You can monitor this in PERFMON under the .NET CLR Memory object. If this counter stays above 30%, then you almost certainly have a GC problem. The PNG that you pasted doesn't tell me how much memory is being allocated per second; if the allocation rate is high, the GC will kick in more often.
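As a rough illustration of that check, here is a minimal sketch that summarizes a series of "% Time in GC" readings against the 30% rule of thumb. The function name `summarize_gc_time` and the sample readings are hypothetical; in practice the values would come from a Performance Monitor log of your own process.

```python
from statistics import mean

# Rule-of-thumb threshold from the paragraph above, in percent.
GC_TIME_THRESHOLD = 30.0


def summarize_gc_time(samples, threshold=GC_TIME_THRESHOLD):
    """Summarize '% Time in GC' readings (one float per sampling interval)."""
    if not samples:
        raise ValueError("no samples supplied")
    over = [s for s in samples if s > threshold]
    average = mean(samples)
    return {
        "average": average,
        "peak": max(samples),
        # Share of intervals in which the counter exceeded the threshold.
        "fraction_over": len(over) / len(samples),
        # Flag the GC as suspect only when the average stays above the threshold.
        "suspect_gc": average > threshold,
    }


# Hypothetical readings, e.g. copied out of a Performance Monitor log.
readings = [12.0, 45.5, 38.0, 41.5, 9.0]
summary = summarize_gc_time(readings)
print(summary)
```

Looking at the average as well as the fraction of intervals over the threshold helps separate a process that is constantly collecting from one that only spikes occasionally.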

One word of caution: use the latest Debug Diagnostic Tool to analyze the dump and see whether the result is different. There is an issue in DebugDiag 1.2 that can cause the "middle of a garbage collection" message to be reported incorrectly, so you want to make sure that the GC really was running at the time of the dump. You can download the latest version from http://www.microsoft.com/en-us/download/details.aspx?id=40336, analyze the same dump file with it, and see whether it reports something different. Also search your report for the string GarbageCollectionGeneration, which will show you the thread that was responsible for invoking the GC. Looking at the stack of that thread, you may be able to identify what it was doing that ended up triggering a GC. This may or may not help.

Answering your other questions:

1) The GC can get stuck if any thread has preemptive GC disabled. Read the article http://blogs.msdn.com/b/tess/archive/2008/02/11/hang-caused-by-gc-xml-deadlock.aspx to understand more about preemptive GC and how to find the threads that have it disabled.

2) I would open the dump in WinDbg and follow the approach described in the blog post above to see whether it leads anywhere. If the framework is .NET Framework 4.0 or above, you can also use the PerfView tool to collect a trace, which gives good information about why GCs are happening. Check out the video on using PerfView to troubleshoot a GC issue at http://channel9.msdn.com/Series/PerfView-Tutorial/PerfView-Tutorial-9-NET-Memory-Investigation-Basics-of-GC-Heap-Snapshots

Also, if this EXE is doing background processing, you may want to enable the server mode of the GC and see whether that helps. Read the article http://blogs.msdn.com/b/clyon/archive/2004/09/08/226981.aspx for more information.
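For reference, on the .NET Framework server GC is switched on through the application's configuration file. A minimal sketch, assuming your executable is called MyApp.exe (a hypothetical name), so the file would be MyApp.exe.config in the same folder:

```xml
<configuration>
  <runtime>
    <!-- Ask the CLR to use server garbage collection instead of workstation GC -->
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```

Note that server GC only takes effect on a machine with more than one processor; on a single-processor machine the runtime falls back to workstation GC, which is one reason the processor count asked about above matters.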

Hope this helps!

Puneet Gupta