5

Why doesn't garbage collection reclaim all the memory used by a very simple DataTable after I create it and then set the reference to null? Here is an example. I expected Removed to equal Before, but it does not.

{
 long Before = 0, After = 0, Removed = 0, Collected = 0;

 Before = GC.GetTotalMemory(true);
 DataTable dt = GetSomeDataTableFromSql();
 After = GC.GetTotalMemory(true);
 dt = null;
 Removed = GC.GetTotalMemory(true);
 GC.Collect();
 Collected = GC.GetTotalMemory(true);
}

Gives the following results.

Before = 388116
After = 731248
Removed = 530176
Collected = 530176
Kara
gmac

6 Answers

5

Several reasons:

GC runs in its own sweet time; usually when the runtime is short on memory. This is why disposing objects like DB connections is important; yes they'll be released eventually, but not until GC deigns to run.
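As a minimal sketch of the point about disposing resources, the following uses a hypothetical `FakeConnection` class (not a real ADO.NET type) to stand in for something like a DB connection. It shows that `using` releases the resource at a known point, independent of when the GC runs.

```csharp
using System;

// Hypothetical stand-in for a resource such as a database connection.
public sealed class FakeConnection : IDisposable
{
    public bool IsOpen { get; private set; } = true;

    public void Dispose()
    {
        // Release the resource deterministically instead of waiting for the GC.
        IsOpen = false;
    }
}

public static class DisposeDemo
{
    public static bool OpenAndDispose()
    {
        FakeConnection conn;
        using (conn = new FakeConnection())
        {
            // Work with the connection here.
        }
        // Dispose has already run at this point, regardless of GC timing.
        return conn.IsOpen;
    }

    public static void Main()
    {
        Console.WriteLine(OpenAndDispose()); // prints "False"
    }
}
```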

GC.Collect() does not run the GC thread directly; it schedules a run of GC. Again, the runtime normally only runs GC when it notices the sandbox is getting cluttered, or if there is significant idle time. GC.Collect() is an override that behaves the same as if one of these automatic triggers had happened. It is not an inline call to run the garbage collection algorithm; that would result in noticeable performance degradation.

GC runs in its own thread. Therefore, information provided by the GC static methods is based on what is available to the caller at the time they are called. You are calling GetTotalMemory for the last time while the GC is still working, or maybe before it even starts, and so the memory figures haven't been updated with things the GC is finalizing.

In summary, GC is designed to be largely hands-off. GC.Collect() is equivalent to hanging the "please service" sign on your hotel door; it's a suggestion that maybe now would be a good time to clean up.

KeithS
4

GC.Collect(); is merely a suggestion to the Garbage Collector that there may be objects that need to be cleaned up. The GC runs on its own schedule and it's very rare that it will need the GC.Collect(); prompting.

The chances of seeing an impact on memory by calling GC.Collect(); immediately (microseconds) after you've released a resource are slim.

Also: The DataTable object isn't special in the eyes of the GC. Any reference type in .NET will be treated by the GC in the same way.

Paul Sasik
  • +1. that's a nice, short way to answer the fact that the runtime offers no guarantee on collection. Nicely worded. – David Jan 12 '11 at 19:47
  • @David: Thanks. I think the question and code sample is also a great example of how the GC doesn't jump exactly when you tell it to! – Paul Sasik Jan 12 '11 at 19:50
  • The documentation for GC.Collect says, "Forces an immediate garbage collection of all generations." http://msdn.microsoft.com/en-us/library/xe0c2357.aspx – Lee Jan 12 '11 at 20:23
  • @Lee: Interesting, I checked the link and... it just isn't accurate. I have seen many other MSDN resources that counter that statement and if you check the responses in this thread (like KeithS's & David Stratton's) they counter it as well. You have to take MSDN documentation with a grain of salt. It isn't necessarily written by the people that wrote the APIs. – Paul Sasik Jan 12 '11 at 22:38
  • @Paul: I cannot find MSDN resources that counter this statement. KeithS's answer does not cite sources, though with a rep of 27K and yours with 29K, I give your opinions a ton of weight. David Stratton's sources all speak to garbage collection being indeterminate, but do not say that calls to GC.Collect are indeterminate. My personal guess is that GC.Collect would have to wait for a safe point on all threads and then collect. Is there documentation that calling GC.Collect is really just a "suggestion" that can be postponed, or even ignored? – Mike Rosenblum Jan 04 '13 at 19:16
  • @MikeRosenblum: I did a few searches and looked at over a dozen pages but could not find anything conclusive. I honestly think that I got the info that I'm sharing in the answer from Jeffrey Richter's CLR via C# 2nd ed. but am not 100% positive. If you really need to dig up an official reference I would suggest posting a new question (referring to this one) and perhaps offering a bounty. – Paul Sasik Jan 06 '13 at 06:26
3

The documentation for GC.GetTotalMemory states:

The garbage collector does not guarantee that all inaccessible memory is collected.

It suggests that it will only block for a short interval to wait for garbage collection and finalisers to complete. This SO answer explains that DataTables do not hold any unmanaged resources and suppress finalisation, so you should not need to call GC.WaitForPendingFinalizers for memory to be reclaimed.
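For objects that do have pending finalizers, a commonly used measurement pattern is to collect, wait for finalizers, and collect again before reading the total. The sketch below assumes a hypothetical helper name, `SettledTotalMemory`, and uses a plain byte array rather than a DataTable; the numbers it prints will vary by runtime and machine.

```csharp
using System;

public static class MemoryProbe
{
    // Runs a full collection, lets pending finalizers run, then collects
    // again so that objects released by those finalizers are reclaimed too.
    public static long SettledTotalMemory()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return GC.GetTotalMemory(false);
    }

    public static void Main()
    {
        long before = SettledTotalMemory();
        byte[] buffer = new byte[1000000];
        long after = SettledTotalMemory();
        GC.KeepAlive(buffer); // keep the array reachable until after the measurement
        buffer = null;
        long removed = SettledTotalMemory();
        Console.WriteLine("Before=" + before + " After=" + after + " Removed=" + removed);
    }
}
```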

Another possibility is that dt is not eligible for collection when GC.Collect is called - if there is a class-member or parent DataSet holding a reference to it, then it cannot be collected.
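One way to check whether something still references an object is to track it with a WeakReference. This is only a sketch: `_holder` is a hypothetical static field standing in for a class member or parent DataSet, and the allocation is done in a separate non-inlined method so that no local variable keeps the object alive.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReachabilityDemo
{
    // Hypothetical field standing in for a class member that still references the object.
    private static object _holder;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate(bool keepReference)
    {
        object data = new byte[1000];
        if (keepReference)
        {
            _holder = data; // a strong reference that outlives this method
        }
        return new WeakReference(data);
    }

    public static void Main()
    {
        WeakReference held = Allocate(true);
        WeakReference released = Allocate(false);
        GC.Collect();
        // The held object cannot be collected while _holder references it.
        Console.WriteLine("held alive: " + held.IsAlive);
        // The released object is eligible for collection; it is usually gone by now.
        Console.WriteLine("released alive: " + released.IsAlive);
    }
}
```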

In addition, contrary to some of the other answers, GC.Collect does force an immediate collection (not just a 'hint') - the documentation states:

Forces an immediate garbage collection of all generations.

This article also says this is the case - in the 'Conditions for a garbage collection' section, one of the three possibilities is:

The GC.Collect method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
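A rough way to observe this yourself is to compare GC.CollectionCount before and after the call. The helper name `Gen2CollectionsTriggered` is made up for this sketch; GC.CollectionCount and GC.Collect are the real APIs.

```csharp
using System;

public static class CollectCountDemo
{
    // Returns how many generation 2 collections happened during one GC.Collect() call.
    public static int Gen2CollectionsTriggered()
    {
        int before = GC.CollectionCount(2);
        GC.Collect();
        return GC.CollectionCount(2) - before;
    }

    public static void Main()
    {
        // A value of at least 1 means a collection had completed by the time GC.Collect() returned.
        Console.WriteLine(Gen2CollectionsTriggered());
    }
}
```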

Community
Lee
1

The documentation on garbage collection in .NET has ALWAYS stated that it makes no guarantees as to when collection occurs.

http://msdn.microsoft.com/en-us/magazine/bb985010.aspx

http://msdn.microsoft.com/en-us/library/ee787088.aspx

http://www.simple-talk.com/dotnet/.net-framework/understanding-garbage-collection-in-.net/ - This is a nice article explaining garbage collection, with diagrams that make it easier to grasp.

An excerpt from that last article, relevant to your question:

If an object has a finalizer, it is not immediately removed when the garbage collector decides it is no longer ‘live’. Instead, it becomes a special kind of root until .NET has called the finalizer method. This means that these objects usually require more than one garbage collection to be removed from memory, as they will survive the first time they are found to be unused.
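The behaviour described in that excerpt can be sketched with a small finalizable type. `Finalizable` and `FinalizerDemo` are hypothetical names for illustration, and the exact output depends on the runtime and build settings, so treat this as an experiment rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public sealed class Finalizable
{
    public static int FinalizedCount;

    ~Finalizable()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizerDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate()
    {
        // trackResurrection: true keeps tracking the object until its memory
        // is actually reclaimed, not just until it is queued for finalization.
        return new WeakReference(new Finalizable(), true);
    }

    public static void Main()
    {
        WeakReference wr = Allocate();
        GC.Collect(); // first pass: the object is found unused and queued for finalization
        Console.WriteLine("alive after first collect: " + wr.IsAlive);
        GC.WaitForPendingFinalizers();
        GC.Collect(); // second pass: the finalized object can now be reclaimed
        Console.WriteLine("alive after second collect: " + wr.IsAlive);
        Console.WriteLine("finalizers run: " + Finalizable.FinalizedCount);
    }
}
```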

David
  • Good point on finalizers. Finalizers could in theory be very resource intensive if immediately invoked for all objects that were marked for garbage collection. Imagine if GC.Collect() were a synchronous process that iterated through all collectible objects and invoked their finalizers! – Shan Plourde Jan 12 '11 at 20:10
1

According to http://msdn.microsoft.com/en-us/library/xe0c2357.aspx:

Use this method to try to reclaim all memory that is inaccessible.

All objects, regardless of how long they have been in memory, are considered for collection; however, objects that are referenced in managed code are not collected. Use this method to force the system to try to reclaim the maximum amount of available memory.

Thus it may be possible that calling Collect() will not necessarily produce what you're expecting it to immediately produce.

Shan Plourde
0

In addition to the previous answers about how garbage collection works, the first access of the DataTable class allocates some static variables that are not released even when no instances of DataTable exist. This can also apply to other classes, depending on their implementation. Creating a DataTable and all the

Since you are accessing SQL, according to your GetSomeDataTableFromSql() method, you may have some cached SQL connection instances and other objects you don't directly know about.

So allocating a DataTable instance and then getting rid of it will not get you back to the same level of memory allocation you had.
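One way to separate one-time static allocations from per-instance garbage is to run the same measurement twice and compare. This is a sketch under assumptions: it builds a small in-memory DataTable instead of calling GetSomeDataTableFromSql(), and the names `WarmUpDemo` and `Measure` are made up here. The first run includes whatever the DataTable class initializes on first use; the second should be closer to zero.

```csharp
using System;
using System.Data;
using System.Runtime.CompilerServices;

public static class WarmUpDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void CreateAndDiscardTable()
    {
        // The table becomes unreachable as soon as this method returns.
        DataTable dt = new DataTable("demo");
        dt.Columns.Add("Id", typeof(int));
        dt.Rows.Add(1);
    }

    // Returns the change in managed memory across one create-and-discard cycle.
    public static long Measure()
    {
        long before = GC.GetTotalMemory(true);
        CreateAndDiscardTable();
        long after = GC.GetTotalMemory(true);
        return after - before;
    }

    public static void Main()
    {
        long first = Measure();  // includes one-time static and type initialization
        long second = Measure(); // mostly per-instance garbage, which should be collectible
        Console.WriteLine("first=" + first + " second=" + second);
    }
}
```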

Brian Walker