I believe that I understand what a closure is for an anonymous function and am familiar with the traditional pitfalls. Good questions covering this topic are here and here. The purpose is not to understand why or how this works in a general sense but to suss out intricacies I may be unaware of when depending on the behavior of generated closure class references. Specifically, what pitfalls exist when reporting on the behavior of an externally modified variable captured in a closure?
Example
I have a long-running, massively concurrent worker service that has exactly one error case - when it cannot retrieve work. The degree of concurrency (number of conceptual threads to use) is configurable. Note, conceptual threads are implemented as Tasks<> via the TPL. Because the service constantly loops trying to get work when multiplied by the unknown degree of concurrency this can mean thousands to tens of thousands of errors could be generated per second.
As such, I need a reporting mechanism that is time-bound rather than attempt-bound, that is isolated to its own conceptual thread, and that is cancellable. To that end, I devised a recursive Task lambda that accesses my fault counter every 5 minutes outside of the primary attempt-based looping that is trying to get work:
var faults = 1;
Action<Task> reportDelay = null;
reportDelay =
// 300000 is 5 min
task => Task.Delay(300000, cancellationToken).ContinueWith(
subsequentTask =>
{
// `faults` is modified outside the anon method
Logger.Error(
$"{faults} failed attempts to get work since the last known success.");
reportDelay(subsequentTask);
},
cancellationToken);
// start the report task - runs concurrently with below
reportDelay.Invoke(Task.CompletedTask);
// example get work loop for context
while (true)
{
object work = null;
try
{
work = await GetWork();
cancellationToken.Cancel();
return work;
}
catch
{
faults++;
}
}
Concerns
I understand that, in this case, the generated closure with point by reference to my faults
variable (which is incremented whenever any conceptual thread attempts to get work but can't). I likewise understand that this is generally discouraged, but from what I can tell only because it leads to unexpected behaviors when coded expecting the closure to capture a value.
Here, I want and rely on the closure capturing the faults
variable by reference. I want to report the value of the variable around the time the continuation is called (it does not have to be exact). I am mildly concerned about faults
being prematurely GC'd but I cancel the loop before exiting that lexical scope making me think it should be safe. Is there anything else I'm not thinking of? What dangers are there when considering closure access outside of mutability of the underlying value?
Answer and Explanation
I have accepted an answer below that refactors the code to avoid the need for closure access by reifying the fault monitor into its own class. However, because this does not answer the question directly, I will include a brief explanation here for future readers of the reliable behavior:
So long as the closed-over variable remains in scope for the life of the closure, it can be relied upon to behave as a true reference variable. The dangers of accessing a variable modified in an outer scope from within a closure are:
- You must understand that the variable will behave as a reference within the closure, mutating its value as it is modified in the outer scope. The closure variable will always contain the current runtime value of the outer scope variable, not the value at the time the closure is generated.
- You must write your program in such a way as to garuantee that the lifetime of the exterior variable is the same or greater than the anonymous function/closure itself. If you garbage collect the outer variable then the reference will become an invalid pointer.