Why is the garbage collector allowed to collect seemingly referenced objects with a finalizer?

Question

This question is basically why we need GC.KeepAlive() in the first place.

Here's where we need it. We have a wrapper for some unmanaged resource:

public class CoolWrapper
{
     public CoolWrapper()
     {
         coolResourceHandle = UnmanagedWinApiCode.CreateCoolResource();
         if (coolResourceHandle == IntPtr.Zero)
         {
             // something went wrong, throw exception
         }
     }

     ~CoolWrapper()
     {
         UnmanagedWinApiCode.DestroyCoolResource(coolResource);
     }

     public void DoSomething()
     {
         var result = UnmanagedWinApiCode.DoSomething(coolResource);
         if (result == 0)
         {
             // something went wrong, throw exception
         }
     }

     private IntPtr coolResourceHandle;
}

and our code uses that wrapper:

var wrapper = new CoolWrapper();
wrapper.DoSomething();

If this code runs in the Release configuration and not under a debugger, the optimizer may notice that the reference is not used after this code, and that the coolResourceHandle member variable is not accessed (by managed code) after it is read inside DoSomething() and its value is passed into unmanaged code. The following can then happen:

1. DoSomething() is called
2. coolResourceHandle is read
3. garbage collection suddenly starts
4. ~CoolWrapper() runs
5. UnmanagedWinApiCode.DestroyCoolResource() runs and the resource is destroyed, the resource handle is invalidated
6. UnmanagedWinApiCode.DoSomething() runs using the value which now refers to a non-existing object (or maybe another object is created and assigned that handle)

The situation described above is actually possible: it is a race between a method of the object and a running garbage collection. Even though there is a local variable of reference type on the stack, optimized code ignores that reference, and the object becomes eligible for garbage collection immediately after coolResourceHandle is read inside DoSomething().

So, to prevent this we use GC.KeepAlive():

var wrapper = new CoolWrapper();
wrapper.DoSomething();
GC.KeepAlive(wrapper);

which keeps the object ineligible for GC until the GC.KeepAlive() call is reached.

This of course requires all users to call GC.KeepAlive() everywhere, which they will forget, so the right place for it is CoolWrapper.DoSomething():

 public void DoSomething()
 {
     var result = UnmanagedWinApiCode.DoSomething(coolResource);
     GC.KeepAlive(this);
     if (result == 0)
     {
         // something went wrong, throw exception
     }
 }

and this basically prevents the object from becoming eligible for GC while a method of this object is running.

Why is this needed? Why wouldn't GC ignore the objects which have a method running at that moment and also have a finalizer? This would make life much easier yet we need to use GC.KeepAlive() instead.

Why is such aggressive collection allowed instead of ignoring objects which have methods currently running and a finalizer (and so likely to have problems in case there's a race as described above)?

Well, you picked it: the GC goes by reference count. The theory is that, whether it is doing something or not, if nobody's looking at it, why keep it? — Rob, Dec 05 '17 at 15:04
@Rob The GC does not go by reference count. That statement is false. — InBetween, Dec 05 '17 at 15:05
@InBetween - sorry, I confused it with another one; it just goes by whether there is a reference or not (not the count thereof). Too late to correct my comment, sorry. — Rob, Dec 05 '17 at 15:08
You use `coolResource` and `coolResourceHandle`; one of the two is a typo, isn't it? — Tim Schmelter, Dec 05 '17 at 15:10
There is even code analysis warning for this exact case: https://learn.microsoft.com/en-us/visualstudio/code-quality/ca2115-call-gc-keepalive-when-using-native-resources. So this situation is expected and using `GC.KeepAlive` is "official" method to go here. — Evk, Dec 05 '17 at 15:29
If `CoolWrapper` manages an unmanaged resource, it should implement `IDisposable`. If it did that, `Dispose()` should be where the resource is destroyed. Since `Dispose` would pass `coolResourceHandle` into an unmanaged call, any code path that could still call `Dispose` in the future would keep the reference from being GC'd, so there would be no need for `GC.KeepAlive`. — Daniel Pryden, Dec 05 '17 at 15:32
The real problem here is your use of a finalizer, and worse still, your *inappropriate* use of a finalizer, not the behavior of the GC. If you use the appropriate mechanism for disposing of unmanaged resources, as mentioned by Daniel above, then things work smoothly. Honestly, there's probably never a reason for you to define a finalizer in any of your code *ever*; there are better tools for cleaning up unmanaged resources that don't have problems like these (and others) that finalizers have. — Servy, Dec 05 '17 at 15:42
@Servy So I should implement `IDisposable` and not have a finalizer? Where can I read more about this? — sharptooth, Dec 05 '17 at 15:50
You can read about implementing IDisposable here: https://stackoverflow.com/questions/538060/proper-use-of-the-idisposable-interface ... I'd also recommend reading Eric Lippert's answer regarding destructors (as well as the blog post linked therein) here: https://stackoverflow.com/a/4899622/6157210 ... When in doubt, read the documentation. — Trioj, Dec 05 '17 at 16:24
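
One way the IDisposable suggestion from the comments above could look is sketched below. It is only an illustration under several assumptions: `SafeHandle` is not named anywhere in this thread and is swapped in here as one of the standard tools for owning a native handle without writing a finalizer yourself; `FakeNative` is a hypothetical managed stand-in for `UnmanagedWinApiCode` so that the sketch runs without any real native library; and `CoolResourceSafeHandle` is an invented name.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Win32.SafeHandles;

// Hypothetical managed stand-in for the unmanaged API; it only simulates handles.
static class FakeNative
{
    private static readonly HashSet<IntPtr> Live = new HashSet<IntPtr>();
    private static int _next;

    public static int LiveCount { get { lock (Live) return Live.Count; } }

    public static IntPtr CreateCoolResource()
    {
        lock (Live) { var h = new IntPtr(++_next); Live.Add(h); return h; }
    }

    public static void DestroyCoolResource(IntPtr h) { lock (Live) Live.Remove(h); }

    // Returns 1 while the handle is live, 0 once it has been destroyed.
    public static int DoSomething(IntPtr h) { lock (Live) return Live.Contains(h) ? 1 : 0; }
}

// Owns the handle; the runtime calls ReleaseHandle once, on Dispose or finalization.
sealed class CoolResourceSafeHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public CoolResourceSafeHandle(IntPtr existing) : base(true) { SetHandle(existing); }

    protected override bool ReleaseHandle()
    {
        FakeNative.DestroyCoolResource(handle);
        return true;
    }
}

sealed class CoolWrapper : IDisposable
{
    private readonly CoolResourceSafeHandle _handle;

    public CoolWrapper()
    {
        _handle = new CoolResourceSafeHandle(FakeNative.CreateCoolResource());
        if (_handle.IsInvalid)
            throw new InvalidOperationException("CreateCoolResource failed.");
    }

    public void DoSomething()
    {
        // Hold a reference on the handle for the duration of the call, so it
        // cannot be released underneath us. With a real P/Invoke signature that
        // takes a SafeHandle, the marshaller does this bookkeeping automatically.
        bool addedRef = false;
        try
        {
            _handle.DangerousAddRef(ref addedRef);
            int result = FakeNative.DoSomething(_handle.DangerousGetHandle());
            if (result == 0)
                throw new InvalidOperationException("DoSomething failed.");
        }
        finally
        {
            if (addedRef) _handle.DangerousRelease();
        }
    }

    public void Dispose() { _handle.Dispose(); }
}

static class Program
{
    static void Main()
    {
        using (var wrapper = new CoolWrapper())
        {
            wrapper.DoSomething();
        }
        Console.WriteLine(FakeNative.LiveCount);
    }
}
```

In this shape the wrapper itself has no finalizer, and the `using` block keeps the wrapper referenced until `Dispose()` has run, so the race from the question does not arise on that code path.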

score 5 · Accepted Answer · answered Dec 05 '17 at 15:12

Why is this needed? Why wouldn't GC ignore the objects which have a method running at that moment and also have a finalizer?

Because that's not what the GC (or the C# specification) guarantees. The guarantee is that an object won't be finalized or collected while it's still possible to read a field from it. If the JIT/GC detects that although you're currently executing an instance method, there's no execution path whereby that method will read any more fields, it is legal for the object to be collected (assuming there's nothing else keeping it alive).

It's surprising, but that's the rule - and I strongly suspect that the reason for it is to allow optimization paths that would otherwise be impossible.

Your fix of using GC.KeepAlive is a perfectly reasonable one. Note that the number of situations where this is relevant is pretty tiny.
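
The rule and the fix can be observed with a small experiment. This is a rough sketch, not taken from the question: `Tracked`, `Flag` and both method names are hypothetical. Whether the first call reports an early finalization depends on the runtime, the JIT tier and the build configuration (unoptimized code typically keeps `this` alive for the whole method), so no particular output is promised for it. The second call, by contrast, cannot observe finalization, because `GC.KeepAlive(this)` keeps the object reachable up to that point.

```csharp
using System;

// Hypothetical helper: holds the "was finalized" bit outside the tracked object,
// so that checking it does not require reading a field of the tracked object.
sealed class Flag { public volatile bool Value; }

sealed class Tracked
{
    private readonly Flag _flag = new Flag();

    ~Tracked() { _flag.Value = true; }   // records that finalization happened

    // After _flag is copied to a local, the method never touches 'this' again,
    // so an optimizing JIT may treat the object as dead before the method returns.
    public bool FinalizedDuringCall()
    {
        Flag flag = _flag;               // last read of a field of 'this'
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return flag.Value;               // may be true in optimized builds
    }

    public bool FinalizedDuringCallWithKeepAlive()
    {
        Flag flag = _flag;
        GC.Collect();
        GC.WaitForPendingFinalizers();
        bool finalizedEarly = flag.Value;
        GC.KeepAlive(this);              // 'this' stays reachable up to this call
        return finalizedEarly;           // false: the finalizer cannot have run yet
    }
}

static class Demo
{
    static void Main()
    {
        Console.WriteLine(new Tracked().FinalizedDuringCall());
        Console.WriteLine(new Tracked().FinalizedDuringCallWithKeepAlive());
    }
}
```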

…and it's worth mentioning that the optimizer may even reduce the field accesses, allowing an even earlier collection. The garbage collector's task is identifying reusable memory, and it's good at its job. The actual issue is the idea of building non-memory resource management atop something that only cares about memory, a design mistake when Java was created; why C# not only repeated this mistake but also added the misleading C++ destructor syntax, I will never understand. — Holger, Dec 07 '17 at 09:04

score 2 · Answer 2 · answered Dec 05 '17 at 15:47

Finalizers don't "collect" anything. Instead, they prevent objects from being collected and notify objects that they would have been collected but for the existence of an active finalizer. Note that if object X holds a reference to Y, Y will be uncollectable if either X or Y has an active finalizer. Y's finalizer (if it exists) will have no way of knowing whether it's the only thing keeping Y alive, or whether other finalizers may exist that would also keep Y alive.

A fundamental principle is that objects exist as long as any reference to them exists anywhere; as soon as the last reference to an object ceases to exist, the object will as well. The GC does not destroy objects; instead, it reclaims memory that was formerly used by objects that have ceased to exist. If an object has an active finalizer, a reference to it will be kept in a special list of objects that have active finalizers; as long as that reference exists, the object will do so as well. When a GC is performed, the system marks all the objects that would exist even in the absence of that list, and once that's done it produces a queue of objects that are on that list but haven't been marked. After that, it will start calling finalizers of objects on that queue.
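
The queueing behaviour described above can be probed with a long weak reference, that is, a `WeakReference` created with `trackResurrection: true`, which keeps pointing at its target until the memory is actually reclaimed. The sketch below is illustrative only: `Finalizable` and `QueueDemo` are hypothetical names, and the second observation in particular depends on the runtime (a collector that scans the stack conservatively may keep the object alive longer), so it is printed rather than relied upon.

```csharp
using System;
using System.Runtime.CompilerServices;

sealed class Finalizable
{
    public static int FinalizerRuns;
    ~Finalizable() { FinalizerRuns++; }
}

static class QueueDemo
{
    // Created in a separate, non-inlined method so the caller's frame
    // holds no strong reference to the new object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference CreateUnreferenced()
    {
        return new WeakReference(new Finalizable(), trackResurrection: true);
    }

    public static (bool afterFirstGc, bool afterSecondGc) Observe()
    {
        WeakReference weak = CreateUnreferenced();

        // First collection: the object is unreachable from the program, but it
        // still has a pending finalizer, so it survives on the finalization queue.
        GC.Collect();
        bool afterFirstGc = weak.IsAlive;

        // Let the finalizer run, then collect again: with no finalizer pending,
        // the memory becomes reclaimable.
        GC.WaitForPendingFinalizers();
        GC.Collect();
        bool afterSecondGc = weak.IsAlive;

        return (afterFirstGc, afterSecondGc);
    }

    static void Main()
    {
        var result = Observe();
        Console.WriteLine($"alive after first GC:  {result.afterFirstGc}");
        Console.WriteLine($"alive after second GC: {result.afterSecondGc}");
    }
}
```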

score 1 · Answer 3 · answered Dec 05 '17 at 15:25

Consider any method that creates garbage and then spends a long time doing other things before exiting. The obvious example is the main method of any executable, which may perform any number of initialization actions before entering some form of loop (such as a Windows message loop) that won't exit for the entire lifetime of the process.

We want to be able to clean up that garbage. But that means we have to allow the GC to not treat methods as opaque - it has to be able to inspect a running method and know what is still in use right at this moment and only protect those items from being collected.

This is why the GC is "aggressive" and why object collection can happen at any time - even whilst the constructor is still running (assuming it will not access any instance members from the current point of its execution forwards).
