Will GC clean up objects referenced by non-live but in-scope references?

Question

I'm wondering if I can depend on the .NET garbage collector to avoid keeping a bunch of extra heap objects around in this type of scenario:

public static void Main(string[] args) {
    var a = ParseFileAndProduceABigTreeObject(args[0]);
    var b = WalkTheBigTreeObjectAndProduceSomeOtherBigObject(a);
    var c = AThirdRoundOfProcessing(b);
    Console.WriteLine(c.ToString());
}

In each phase here, it's to be understood that the objects returned by each method hold no references to the previous objects, so b doesn't reference a, and c doesn't reference a or b.

A naive implementation of GC would keep a around for the entire duration of the program because it continues to be reachable via the Main method's stack frame.

What I'm wondering is if I can depend on .NET to do liveness analysis and determine that by the time the third line (AThirdRoundOfProcessing) is executed, a is no longer needed and its memory can be reclaimed if necessary?

I'm almost certain that .NET handles cases like this at least sometimes, so my question is really this: is it consistent enough for me to rely on it, or should I just assume that it might not and take measures to make the code more foolproof? (Here, for example, I could set a to null.)

P.S.: What about OpenJDK, would it handle it nicely?

Edit: I may not have been clear. I'm aware that, in terms of standards, the runtime is allowed but not required to collect a. Were I running my code on an imaginary runtime where all I knew was that it's conformant, I'd have to live with that uncertainty. However, my code is running on a recent version of the Microsoft .NET 4 runtime. What I'd like to know is whether that runtime can be expected to do this or not. The runtime is actual code and it's possible to know exactly what it would do. Maybe someone out there has that knowledge and would like to share it.

If you're worried about it keeping the reference alive until it goes out of scope, couldn't you just call it as `AThirdRoundOfProcessing(WalkTheBigTreeObjectAndProduceSomeOtherBigObject(ParseFileAndProduceABigTreeObject(args[0])));`? — Kateract, Apr 08 '16 at 20:02
Yes, they will be eligible for collection immediately. The GC will even collect an object while one of its instance methods is running if its `this` reference is no longer required. — Lee, Apr 08 '16 at 20:05
@Kateract: That's just a different way of helping out the runtime, and a pretty ugly one IMO. I gave a toy example, but imagine this is 10 or 20 method calls. — Nate C-K, Apr 08 '16 at 20:06
You could always create 3 `WeakReference` objects and test to see if they get collected after a forced collection. In release mode, without a debugger attached, you should see the references get collected after a forced collection. — Scott Chamberlain, Apr 08 '16 at 20:16
If the spec says it is undefined, why would you want to rely on how a particular implementation behaves? What if a later version doesn't do it the same way? — n8wrl, Apr 08 '16 at 20:23
Per your edit, if all you care about is a specific version of the runtime why don't you test that specific version and find out? — Scott Chamberlain, Apr 08 '16 at 20:25
@ScottChamberlain: If I undertook to find the answers to all the questions I have on my own, I would never ask anything on Stack Overflow. This is something I'd like to know, but it's not something I need to know. — Nate C-K, Apr 08 '16 at 20:28
Perhaps "rely" implies an overstatement of my desire to have this happen. If I had anything important riding on the answer to this question, I would 1) structure my code so as to be more certain of its behavior, and 2) not trust random people on Stack Overflow to give me the answer. — Nate C-K, Apr 08 '16 at 20:38
Covered pretty well already in [this Q+A](http://stackoverflow.com/questions/17130382/understanding-garbage-collection-in-net). — Hans Passant, Apr 08 '16 at 22:02

score 4 · Answer 1 · edited Apr 08 '16 at 20:14

You can never rely on the GC to clean up anything, ever. It's never required to clean up objects as soon as they're eligible for collection. The entire IDisposable pattern exists precisely because there is no deterministic way of having the GC clean up resources. The power of the GC is that it doesn't have to clean up resources as soon as their lifetime ends. It can do its job much more effectively when it is free to clean up eligible resources whenever it wants to, with virtually no requirements about when a given resource must have been cleaned up.

The object is eligible for collection as soon as the runtime can prove that the object can never be accessed again from code executing in the future, so in your case, based on your description, those objects are eligible for collection, but you can have no expectation whatsoever that they will actually be collected at any point before the entire process gets torn down.
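The contrast drawn above between GC eligibility and deterministic cleanup can be sketched roughly as follows. `ScratchResource` and `DisposeDemo` are hypothetical names invented for this illustration; the class stands in for a type that owns something like a file handle.

```csharp
using System;

// Hypothetical stand-in for a type that owns a scarce resource (illustration only).
public sealed class ScratchResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    // Called deterministically when a using block exits,
    // regardless of when (or whether) the GC ever runs.
    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class DisposeDemo
{
    public static bool Run()
    {
        var resource = new ScratchResource();
        using (resource)
        {
            // Work with the resource here.
        }
        // Dispose has already run at this point. The memory of the object
        // itself is still reclaimed only whenever the GC decides to do so.
        return resource.IsDisposed;
    }
}
```

The point of the sketch is that `Dispose` gives a guaranteed cleanup moment, while the memory behind the object remains subject to the GC's own schedule.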

answered Apr 08 '16 at 20:07 by Servy · edited Apr 08 '16 at 20:14 by n8wrl

I appreciate what you're saying but it doesn't tell me anything I didn't already know. I don't really want to know for sure if an object will be collected, as having it collected does nothing for me, per se. I want to know if the current implementation of .NET will collect `a` or run out of memory if line 3 allocates a whole bunch of objects and it runs out of available space. – Nate C-K Apr 08 '16 at 20:09
I'm also aware that the standard allows this collection to happen. What I don't know is whether the implementation takes advantage of this allowance in the standard. – Nate C-K Apr 08 '16 at 20:11
@NateC-K I don't know if he updated the answer since you wrote that but the 2nd paragraph does answer your exact question. *"The object is eligible for collection as soon as the runtime can prove that the object can never be accessed again from code executing in the future, so in your case, based on your description, those objects are eligible for collection"* – Scott Chamberlain Apr 08 '16 at 20:11
@NateC-K If you knew the answer to your question before you asked it, then you shouldn't really be surprised if the answer you get was something you already knew. The GC is *allowed* to clean up the object, but it is not *required* to do so. – Servy Apr 08 '16 at 20:11
I'm not asking about allowing or requiring. I'm asking about the real-world implementation of the .NET runtime. – Nate C-K Apr 08 '16 at 20:15
@NateC-K So you want to know if you can rely on undefined behavior that .NET is explicitly designed for you not to rely on? Does that question not answer itself? – Servy Apr 08 '16 at 20:20
It's OK to just say you don't know the answer. – Nate C-K Apr 08 '16 at 20:21
@NateC-K I gave you the answer. You asked if you can *depend* on it to be collected. You can't. You repeating the question just because you don't like the answer isn't going to change the answer. – Servy Apr 08 '16 at 20:22

score 2 · Accepted Answer · answered Apr 08 '16 at 20:41

You seem to only be interested in testing a specific version of .NET. Here is a quick example program that could test what the runtime will do for your specific code in the specific configuration you are running it in.

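static void Main(string[] args)
{
    var a = ParseFileAndProduceABigTreeObject(args[0]);
    // A weak reference observes the object without keeping it alive.
    var aWeakReference = new WeakReference(a);

    // Force a full collection, let finalizers run, then collect whatever they released.
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();
    Console.WriteLine($"a: {aWeakReference.IsAlive}");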

    var b = WalkTheBigTreeObjectAndProduceSomeOtherBigObject(a);
    var bWeakReference = new WeakReference(b);

    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();
    Console.WriteLine($"a: {aWeakReference.IsAlive} b: {bWeakReference.IsAlive}");

    var c = AThirdRoundOfProcessing(b);
    var cWeakReference = new WeakReference(c);

    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();
    Console.WriteLine($"a: {aWeakReference.IsAlive} b: {bWeakReference.IsAlive} c:{cWeakReference.IsAlive}");

    Console.WriteLine(c.ToString());

    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();
    Console.WriteLine($"a: {aWeakReference.IsAlive} b: {bWeakReference.IsAlive} c:{cWeakReference.IsAlive}" );

    Console.ReadLine();
}

In debug mode, with or without a debugger attached, you get:

a: True
a: True b: True
a: True b: True c:True
This is some processed Result!
a: True b: True c:True

In release mode, on 4.5.2, with or without the debugger attached, you get:

a: False
a: False b: True
a: False b: False c:True
This is some processed Result!
a: False b: False c:False

I would not trust the release results with the debugger attached, though. I really expected the same results as the debug build, so I may just have my settings wonky.
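For anyone who wants to repeat the experiment without the original processing methods, here is a minimal self-contained sketch of the same probe. `ParsePhase` and `WalkPhase` are hypothetical stand-ins for the question's phases, and `GC.KeepAlive` is added so that `b` is guaranteed to stay alive across the first collection. No output is shown for `a` because it depends on the build configuration and the runtime; on newer runtimes with tiered compilation, the first executions of a method may use unoptimized code, so results can differ from the ones above.

```csharp
using System;

public static class LivenessProbe
{
    // Hypothetical stand-ins for the question's phases;
    // each result holds no reference to its input.
    static int[] ParsePhase(int size) => new int[size];
    static long[] WalkPhase(int[] tree) => new long[tree.Length];

    // The same forced-collection sequence the answer repeats inline.
    static void ForceFullCollection()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }

    public static string[] Run()
    {
        var a = ParsePhase(1000000);
        var aWeak = new WeakReference(a);
        var b = WalkPhase(a);              // last use of a
        var bWeak = new WeakReference(b);

        ForceFullCollection();
        string afterWalk = $"a: {aWeak.IsAlive} b: {bWeak.IsAlive}";
        GC.KeepAlive(b);                   // b is guaranteed live up to this call

        ForceFullCollection();
        string afterLastUse = $"a: {aWeak.IsAlive} b: {bWeak.IsAlive}";

        return new[] { afterWalk, afterLastUse };
    }
}
```

Calling `LivenessProbe.Run()` and printing both lines in a debug build and a release build should show whether the JIT in that configuration treats `a` as dead after its last use.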

Thanks! I hadn't planned on running any tests, but I can't turn up my nose at your helpfulness, so I ran this (well, a variant) on my machine. The release build has released all of the memory by the end of Main. The debug build never releases any of the objects. — Nate C-K, Apr 08 '16 at 21:31
It would be more interesting if I found the sweet spot where the debug build ran out of memory and the release build didn't, but the tests I've run so far run out of memory for either both, or neither. — Nate C-K, Apr 08 '16 at 21:33

usr · Answer 3 · 2016-04-08T21:27:37.087

The other answers and comments have already explained that there is no guarantee of collection. You seem to understand that.

In practice this will work. The JIT tracks local value lifetimes and this is a rather easy tracking problem.

For backwards compatibility reasons it is hard for the JIT to start tracking less precisely, because this might cause memory usage to explode for some apps. So over time, tracking is unlikely to lose precision in common cases.

I believe framework code relies on this as well. I have seen libraries rely on it and I have relied on it myself.

Clearly, this is not a reference answer but it's "common knowledge" that this will work.

Note that in Debug mode local variable lifetimes are extended to the end of the method call to aid in debugging. So this requires an optimized JIT operation.

If you want to be more certain that collection will happen, you can try to split methods off. Separating off stack frames is more reliable but still not guaranteed. Another similar idea is to put locals into an object[] and explicitly null the slots in that array when you are done with each object. Again, no guarantee.
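The method-splitting idea might look roughly like this. All names here are hypothetical stand-ins for the question's phases, and the `NoInlining` attribute is an assumption on my part about how to keep the JIT from merging the helper back into its caller. As noted, none of this is a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class PhasedPipeline
{
    // Hypothetical stand-ins for the question's three phases.
    static int[] Parse(string path) => new int[path.Length + 1];
    static long[] Walk(int[] tree) => new long[tree.Length];
    static string Finish(long[] data) => $"processed {data.Length} items";

    // The intermediate tree is referenced only from this frame.
    // NoInlining asks the JIT not to fold this frame into the caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static long[] ParseAndWalk(string path)
    {
        var a = Parse(path);
        return Walk(a);
    }

    public static string Run(string path)
    {
        // Once ParseAndWalk returns, no stack frame refers to the tree.
        var b = ParseAndWalk(path);
        return Finish(b);
    }
}
```

With this shape, the tree is unreachable from any frame while `Finish` runs, even in a build where locals are kept alive until the end of their method.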

You mentioned a = null; as another strategy. This would possibly help in debug mode. In optimized mode the JIT would kill that assignment to a dead variable. This would only help in pathological cases that border on being JIT bugs. Not a good strategy.

There is no guarantee of any collection because a null garbage collector, one that never collects anything, satisfies all guarantees that the runtime makes.

While it is true that no formal guarantees are made in this case, programmers rely on implicit guarantees all the time. For example, many apps rely on Enumerable.Select not reordering elements. This is not guaranteed in the documentation, yet most professional programmers would feel comfortable relying on this behavior.

It is not a useful attitude to rely only on formally guaranteed behavior in all cases, except when programming the Mars Rover or the Therac-25 (a medical device that irradiated patients to death).
