1

My IAsyncEnumerable<T> function must run its cleanup code regardless of whether the enumerator it returns is disposed of correctly. In the example below, I've used the task of returning a byte array to the array pool as an example of mandatory cleanup code. Each test ceases use of the enumerator before it has completely finished. All but the last of the tests correctly dispose the enumerator via the foreach utility, but the last test deliberately does not dispose of the enumerator. Instead, it allows the enumerator to pass out of scope, and then triggers garbage collection in an attempt to see if the system itself can trigger the last of the cleanup code.

If the cleanup code runs correctly for each test scenario, the expected output would be:

Starting 'Cancellation token'.
Finalizing 'Cancellation token'.
Starting 'Exception'.
Finalizing 'Exception'.
Starting 'Break'.
Finalizing 'Break'.
Starting 'Forget to dispose'.
Finalizing 'Forget to dispose'.

However, I have never been able to create a test in which the final expected output line appears.

Here is the test code:

using System;
using System.Buffers;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// -- Cancellation token test --
var cts = new CancellationTokenSource();
await foreach (var (index, bytes) in GetDataPackets("Cancellation token", cts.Token)) {
    if (index == 2)
        cts.Cancel();
}

// -- Thrown exception test --
try {
    await foreach (var (index, bytes) in GetDataPackets("Exception")) {
        if (index == 2)
            throw new Exception("Boom");
    }
} catch { }

// -- With Break test --
await foreach (var (index, bytes) in GetDataPackets("Break")) {
    if (index == 2)
        break;
}

// -- Forget to dispose test --
// Create variables and forget them in another "no inlining" method
// to make sure they have gone out of scope and are available for garbage collection.
await ForgetToDispose();
GC.Collect();
GC.WaitForPendingFinalizers();

[MethodImpl(MethodImplOptions.NoInlining)]
async Task ForgetToDispose() {
    var enumerable = GetDataPackets("Forget to dispose");
    var enumerator = enumerable.GetAsyncEnumerator();
    await enumerator.MoveNextAsync();
    while (enumerator.Current.Index != 2)
        await enumerator.MoveNextAsync();
}

static async IAsyncEnumerable<(int Index, Memory<byte> Bytes)> GetDataPackets(string jobName, [EnumeratorCancellation] CancellationToken cancellationToken = default) {
    Console.WriteLine($"Starting '{jobName}'.");
    var rand = new Random();
    var buffer = ArrayPool<byte>.Shared.Rent(512);
    try {
        for (var i = 0; i < 10; i++) {
            try {
                await Task.Delay(10, cancellationToken);
            } catch (OperationCanceledException) {
                yield break;
            }
            rand.NextBytes(buffer);
            yield return (i, new Memory<byte>(buffer));
        }
    } finally {
        Console.WriteLine($"Finalizing '{jobName}'.");
        ArrayPool<byte>.Shared.Return(buffer);
    }
}

Persisting, I then tried a few ideas that might help. None of them have:

Idea 1: Add a using statement and a disposable struct that performs the cleanup work: Fail.

static async IAsyncEnumerable<(int Index, Memory<byte> Bytes)> GetDataPackets(string jobName, [EnumeratorCancellation] CancellationToken cancellationToken = default) {
    Console.WriteLine($"Starting '{jobName}'.");
    var rand = new Random();
    var buffer = ArrayPool<byte>.Shared.Rent(512);
    using var disposer = new Disposer(() => {
        Console.WriteLine($"Finalizing '{jobName}'.");
        ArrayPool<byte>.Shared.Return(buffer);
    });
    for (var i = 0; i < 10; i++) {
        try {
            await Task.Delay(10, cancellationToken);
        } catch (OperationCanceledException) {
            yield break;
        }
        rand.NextBytes(buffer);
        yield return (i, new Memory<byte>(buffer));
    }
}

readonly struct Disposer : IDisposable {
    readonly Action _disposeAction;
    public Disposer(Action disposeAction)
        => _disposeAction = disposeAction;
    public void Dispose() {
        _disposeAction();
    }
}

Idea 2: Convert the Disposer struct to a class with a finalizer method in the hope that its finalizer might get triggered: Also fail.

class Disposer : IDisposable {
    readonly Action _disposeAction;
    public Disposer(Action disposeAction)
        => _disposeAction = disposeAction;
    public void Dispose() {
        _disposeAction();
        GC.SuppressFinalize(this);
    }
    ~Disposer() {
        _disposeAction();
    }
}

Aside from writing my own enumerator class from scratch, How can I make this compiler-generated enumerator always run its cleanup code, even in the finalizer thread when it has not been correctly disposed?

bboyle1234
  • 4,859
  • 2
  • 24
  • 29
  • "regardless of whether the enumerator it returns is disposed of correctly". I stopped reading after that. Failing to dispose of the enumerator is a bug. IMHO, don't try to work around that. – Jeremy Lakeman Jan 31 '21 at 23:51
  • There is no guarantee anymore even when calling `WaitForPendingFinalizers` see https://github.com/dotnet/runtime/issues/9328 Try running in Release build, also try `GC.Collect(2);` – Charlieface Jan 31 '21 at 23:52
  • It seems that "best practice" is probably still https://stackoverflow.com/questions/4737056/detecting-leaked-idisposable-objects. Run a memory profiler after your tests to detect leaked objects. – Jeremy Lakeman Jan 31 '21 at 23:55
  • "on my machine" your second idea works. – Guru Stron Feb 01 '21 at 00:01
  • @GuruStron, can I have a loan of your machine please mate? :D – bboyle1234 Feb 01 '21 at 00:36
  • @GuruStron, more seriously, what framework are you targeting? I've tested on net5 and netcoreapp3.1, both Debug and Release builds – bboyle1234 Feb 01 '21 at 00:47
  • 1
    As per `Memory` [constructor](https://github.com/dotnet/runtime/blob/3d8d29731b62c6bdc37c6fb3e22ef8bcdd5ec99b/src/libraries/System.Private.CoreLib/src/System/Memory.cs#L42) you're not allowed to yield the pooled array because consumer of the Enumerable normally expects an exclusive copy of the array for the each yielded item. Design issue. Also consumer may still keep a reference to the array and use it while enumerator already returned it to the pool e.g. after it was yielded the last item. The possible fix is managing the buffer array outside of enumerator. – aepot Feb 01 '21 at 01:26
  • @aepot you are right. This particular api expects the user to handle the `Memory` inside the interator loop and NOT, for example, add it to a list for later use. That's for performance reasons. `System.IO.Pipes` has the same expectations of users when reading a pipe. – bboyle1234 Feb 01 '21 at 01:41
  • In other words where the guarantee that consumer already handled the previous yielded buffer before the producer fill it with the new data? Possible data corruption? The design of this enumerator tells that only one buffer can be in use at once. What the sense of async implementation then? No sense if you produce-consume-produce-consume-etc. sequentially. Sequential flow is more looks like a sync job rather than async. – aepot Feb 01 '21 at 01:42
  • I suggest you to look at `System.Threading.Channels` to pass the data in async manner. E.g. make possible multiple buffers alive even pooled ones. Producer rents, consumer returns = the pool is happy and GC is idle. :) and finally no possible data corruption as every buffer is exclusive array. – aepot Feb 01 '21 at 01:46

1 Answers1

2

My IAsyncEnumerable function must run its cleanup code regardless of whether the enumerator it returns is disposed of correctly.

Full stop. You can't have any type run managed cleanup code regardless of whether it is disposed correctly. This is simply not possible in .NET.

You can write a finalizer, but finalizers cannot have arbitrary code. Usually they are restricted to accessing value type members and doing some p/Invoke-style calls. They cannot, say, return a buffer to an array pool. More generally, with few exceptions, they cannot call any managed code at all. They're really only intended to clean up unmanaged resources.

So this doesn't have anything to do with asynchronous enumerators. You can't guarantee cleanup code will be run if the object isn't disposed, and this is the case for any kind of object.

The best solution is to run the cleanup code when it is disposed (e.g., in a finally block in an async enumerator function). Any code that doesn't dispose that is responsible for creating the resource leak. This is the way all other .NET code works, and this is the way async enumerators work, too.

Stephen Cleary
  • 437,863
  • 77
  • 675
  • 810
  • Then run your test cases or your whole application, use a memory profiler to look for leaked disposables, then add using / dispose methods to fix your code.... eg this is an engineering problem, not a runtime problem that you should try to work around. – Jeremy Lakeman Feb 01 '21 at 02:41
  • If this IS the design principle, summarized as, "there is no backup for users who forget to dispose", why does it not also apply to unmanaged resources? All the documentation I can find suggests using the finalizer to make sure disposal happens when the resource is unmanaged. – bboyle1234 Feb 01 '21 at 23:37
  • Here's an example from SharpDX's DisposeBase class, who not only has a backup disposal in the finalizer, but also notification of disposal event listeners. SharpDX is a very commonly-used library. https://github.com/sharpdx/SharpDX/blob/master/Source/SharpDX/DisposeBase.cs#L24 – bboyle1234 Feb 01 '21 at 23:38
  • @bboyle1234: That code is completely wrong, popularity notwithstanding. – Stephen Cleary Feb 02 '21 at 00:32
  • @StephenCleary, I believe inheriting classes use it to release unmanaged resources. Would it still be wrong in this situation? – bboyle1234 Feb 02 '21 at 01:08
  • @bboyle1234: You mean they do it via event handlers on their own instance? That could perhaps work. But an event handler that referenced any other instance would not be valid. – Stephen Cleary Feb 02 '21 at 02:56
  • @StephenCleary I'm not sure, I didn't dig quite that far. I just know that disposable SharpDX objects wrap unmanaged directX objects, and it's important to get them released ... and the SharpDX writers probably figured their users would be application developers who aren't going to faithfully dispose everything. – bboyle1234 Feb 02 '21 at 03:44