I just noticed something really strange with regard to garbage collection.

The WeakRef method collects the object as expected, while the async method reports that the object is still alive even though we have forced a garbage collection. Any ideas why?

using System;
using System.Diagnostics;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        WeakRef();
        WeakRefAsync().Wait();
    }

    private static void WeakRef()
    {
        var foo = new Foo();
        WeakReference fooRef = new WeakReference(foo);
        foo = null;
        GC.Collect();
        Debug.Assert(!fooRef.IsAlive);
    }

    private static async Task WeakRefAsync()
    {
        var foo = new Foo();
        WeakReference fooRef = new WeakReference(foo);
        foo = null;
        GC.Collect();
        Debug.Assert(!fooRef.IsAlive);
    }
}


public class Foo
{

}
seesharper
  • Curiously if you await the garbage collection it does get collected `await System.Threading.Tasks.Task.Run(() => GC.Collect());` – Equalsk Feb 03 '17 at 11:39
  • First thought: maybe because `foo` and `fooRef` in the `async` method become fields of the compiler-generated state machine class. But I tried it by wrapping these variables in another class and they do get collected... is there any way to ask the GC where/who is still referencing the `Foo` instance? – René Vogt Feb 03 '17 at 11:44
  • IMHO, your method is translated to a class by the compiler and the compiler decides that `foo` is a member of that class. So as long as the object representing your method is alive, its members won't be garbage collected either – Matteo Umili Feb 03 '17 at 11:45
  • @MatteoUmili I checked the compiler-generated code with dotPeek; the generated member gets assigned `null`, too. And as I said, I tried wrapping the variables in a class and they do get collected. – René Vogt Feb 03 '17 at 11:46
  • I think it's related to the act of debugging itself. If you change the `Debug.Assert`s to `Console.WriteLine`s and add a `ReadLine` at the end of main, it prints "True False" for a debug build and "True True" for a release build. Maybe something tied into the way that GC lifetimes are extended for local variables when debugging. – Damien_The_Unbeliever Feb 03 '17 at 11:52
  • Here is a [.NET Fiddle](https://dotnetfiddle.net/Dz2YUE), in case anyone wants to tinker... – David Pine Feb 03 '17 at 12:11
  • @RenéVogt in looking at that dotPeek, you're forgetting the locals involved in assigning to the member. – Jon Hanna Feb 03 '17 at 12:24
  • `foo = null;` is a bug. Understanding [why it is a bug](http://stackoverflow.com/a/17131389/17034) gets you to understand how the C# compiler rewrites an async method and thus get a different outcome. – Hans Passant Feb 03 '17 at 12:28
  • @HansPassant I'd say not having an `await` in the `async` method was the real bug. In real code it's almost never worth doing `someLocal = null`, but in an `async` method (or a `yield`-using one) `foo` is a field that lives between calls, so nulling it could indeed have some value. If there were some `await`s in there it wouldn't be a bug so much as an optimisation that was probably premature. – Jon Hanna Feb 03 '17 at 12:43
  • @Equalsk if you await anything that doesn't return a completed task immediately it'll get collected, because the `MoveNext()` method gets called again with new locals. It's the `await` that was significant there, not that the thing awaited was the collection. – Jon Hanna Feb 03 '17 at 14:46
  • @JonHanna Interesting to know, thanks! – Equalsk Feb 03 '17 at 14:47
  • GC is tricky. 1) Local lifetimes are extended when running in the debugger *or* built for `Debug`. 2) `async` (currently) captures all local variables in its state machine, extending their lifetimes. 3) Setting `foo = null` in a sync method doesn't do anything. 4) Local var lifetimes may be extended for a short time even in release code (for efficiency, methods are broken into chunks, not statement-by-statement). 5) The async state machine is only "lifted" when there's a yielding `await`. – Stephen Cleary Feb 03 '17 at 14:48

1 Answer

> The WeakRef method collects the object as expected

There's no reason to expect that. When I tried it in LINQPad it didn't happen in a debug build, for example, and other valid compilations of both debug and release builds could show either behaviour.

Between them, the compiler and the jitter are free to optimise away the null assignment (nothing uses `foo` after it, after all), in which case the GC could still see the thread as holding a reference to the object and not collect it. Conversely, even without the `foo = null` assignment they would be free to realise that `foo` isn't used any more and to reuse the memory or register that held it for `fooRef` (or for something else entirely), which would let the `Foo` be collected.

So, since it's valid for the GC to see `foo` as either rooted or not rooted, both with and without the `foo = null`, we can reasonably expect either behaviour.

Still, the behaviour you saw is a reasonable guess at what will probably happen; it's just worth pointing out that it isn't guaranteed.
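If the aim is a check that depends less on how the compiler and jitter treat locals, one common approach is to allocate the object in a separate method that is marked as non-inlinable, so the calling frame never holds a strong reference to it. The following is a minimal sketch under that assumption, reusing the `Foo` class from the question; `CreateWeakFoo` is a hypothetical helper name, and even this arrangement is not something the runtime promises to honour:

```csharp
using System;
using System.Runtime.CompilerServices;

public class Foo { }

public static class Program
{
    // Hypothetical helper: the only strong reference to the Foo is a local of
    // this method. NoInlining keeps that local out of the caller's frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateWeakFoo()
    {
        var foo = new Foo();
        return new WeakReference(foo);
    }

    public static void Main()
    {
        WeakReference fooRef = CreateWeakFoo();

        // Force a full collection, let finalizers run, then collect again.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // Inspect the result; no particular outcome is guaranteed by the runtime.
        Console.WriteLine(fooRef.IsAlive);
    }
}
```

The design choice here is simply to move the allocation out of the method that calls `GC.Collect()`, so that there is no `foo` local (or temporary) in that method for the GC to treat as a root.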

Okay, that aside, let's look at what actually happens here.

The state machine produced for the async method is a compiler-generated type (a struct in release builds; the compiler emits a class in debug builds) with fields corresponding to the locals in the source.

So the code:

var foo = new Foo();
WeakReference fooRef = new WeakReference(foo);
foo = null;
GC.Collect();

Is a bit like:

this.foo = new Foo();
this.fooRef = new WeakReference(this.foo);
this.foo = null;
GC.Collect();

But field accesses always have something going on locally. So in that regard it's almost like:

var temp0 = new Foo();
this.foo = temp0;
var temp1 = new WeakReference(this.foo);
this.fooRef = temp1;
Foo temp2 = null;
this.foo = temp2;
GC.Collect();

And `temp0` hasn't been nulled, so the GC still finds the `Foo` rooted.

Two interesting variants of your code are:

var foo = new Foo();
WeakReference fooRef = new WeakReference(foo);
foo = null;
await Task.Delay(0);
GC.Collect();

And:

var foo = new Foo();
WeakReference fooRef = new WeakReference(foo);
foo = null;
await Task.Delay(1);
GC.Collect();

When I ran them (again, reasonable differences in how the memory or registers for locals are handled could produce different outcomes), the first behaved the same as your example. It calls into another `Task` method and awaits it, but that method returns an already-completed task, so the `await` moves straight on to the next statement within the same underlying method call, and that statement is the `GC.Collect()`.

The second sees the `Foo` collected, because the `await` returns at that point and the state machine then has its `MoveNext()` method called again roughly a millisecond later. Since that is a new call to the behind-the-scenes method, there's no local reference to the `Foo`, so the GC can indeed collect it.
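To try both variants side by side, something like the following small console program could be used. It is only a sketch: `IsAliveAfterAsync` is a hypothetical helper that is not part of the question, passing the awaited step in as a delegate adds temporaries of its own, and the printed values depend on the build configuration and runtime, so no particular output is implied.

```csharp
using System;
using System.Threading.Tasks;

public class Foo { }

public static class Program
{
    // Hypothetical harness: repeats the question's steps, with a caller-supplied
    // await placed between clearing the local and forcing the collection.
    public static async Task<bool> IsAliveAfterAsync(Func<Task> awaitStep)
    {
        var foo = new Foo();
        var fooRef = new WeakReference(foo);
        foo = null;

        // Task.Delay(0) hands back a completed task, so execution continues in
        // the same MoveNext() call; Task.Delay(1) normally yields and resumes
        // in a fresh MoveNext() call.
        await awaitStep();

        GC.Collect();
        return fooRef.IsAlive;
    }

    public static void Main()
    {
        bool afterDelay0 = IsAliveAfterAsync(() => Task.Delay(0)).GetAwaiter().GetResult();
        bool afterDelay1 = IsAliveAfterAsync(() => Task.Delay(1)).GetAwaiter().GetResult();

        Console.WriteLine("Alive after Task.Delay(0): " + afterDelay0);
        Console.WriteLine("Alive after Task.Delay(1): " + afterDelay1);
    }
}
```

Running it in both a debug and a release build is a quick way to see how much of the behaviour comes from the build configuration rather than from the `await` itself.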

Incidentally, it's also possible that one day the compiler will not produce fields for those locals that don't live across `await` boundaries, which would be an optimisation that still produces correct behaviour. If that were to happen, your two methods would become much more similar in underlying behaviour and hence more likely to be similar in observed behaviour.

Jon Hanna