Why can't the compiler optimize closure variable by inlining?

Question

I have a Main method like this:

static void Main(string[] args)
{
     var b = new byte[1024 * 1024];

     Func<double> f = () =>
     {
         new Random().NextBytes(b);
         return b.Select(x => (int)x).Average();
     };

     var avg = f();
     Console.WriteLine(avg);
}

Since I am accessing the local variable b here, the compiler creates a class to capture it, and b becomes a field of that class. As a result, b lives as long as the instance of the compiler-generated class, which causes a memory leak. Even if b goes out of scope (maybe not in this situation, but imagine this code is inside a method other than Main), the byte array won't be deallocated.
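To make the capture concrete, here is a rough hand-written approximation of what the compiler produces for the code above. This is a sketch, not the actual generated code: the names CapturedLocals, LambdaBody and ClosureLoweringDemo are made up for illustration (the real class has an unspeakable name along the lines of <>c__DisplayClass0_0), and the averaging uses Select with an explicit conversion.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the compiler-generated closure class.
sealed class CapturedLocals
{
    public byte[] b;   // the hoisted local: now a field that lives as long as this object

    // The lambda body becomes an instance method on the closure class.
    public double LambdaBody()
    {
        new Random().NextBytes(b);
        // Select with an explicit conversion; Cast<int>() cannot unbox a byte as an int.
        return b.Select(x => (int)x).Average();
    }
}

static class ClosureLoweringDemo
{
    public static double Run()
    {
        var locals = new CapturedLocals();
        locals.b = new byte[1024 * 1024];     // the allocation still happens in the enclosing method
        Func<double> f = locals.LambdaBody;   // the delegate's Target is the closure instance
        return f();
    }

    static void Main() => Console.WriteLine(Run());
}
```

As long as anything holds on to f, it also holds on to the CapturedLocals instance and therefore to the 1 MB array.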

What I wonder is: since I am not accessing or modifying b anywhere after declaring the Func, why can't the compiler inline that local variable and skip creating a class? Like this:

Func<double> f = () =>
{
    var b = new byte[1024 * 1024];
    new Random().NextBytes(b);
    return b.Select(x => (int)x).Average();
};

I compiled this code in both Debug and Release modes, and the DisplayClass is generated in both.

Is this just not implemented as an optimization or is there anything I am missing?

tip: use a local function instead of a delegate, and the compiler generates the capture type as a struct instead of a class: https://sharplab.io/#v2:EYLgtghgzgLgpgJwD4AEAMACFBGA3AWACh0tsA6AGQEsA7ARwOIGYsAmDAYQwG8iN/SANiwAWDAFlsACgCUfAb0IDl/AG4QEGYBgC8GGnADuWgJ7wA2tjSsxAKgxWbAXUbyV/FAFYAPABMA9gCuwAA2cAB8GABmuhiyuuFu7oruqQbGAEoQNAFgsmQAcnAAHjAAQmZwUFLAMoyp7igA7FpkHNAw3rQw4fkAgqqIEADmcLL1DQC+rkoN6poQqsOxUeNJKjgAnFKLw3VJk0k4wihi4qyySSlzGlqx6aYWjnYO1iIu6wKfAgHBYdGXWapa4NAQPLI5fx5GSFErlSrVWoTUEeFrANodLo0Hr9QYIEZjfZA9zTb4YMkYeYYXYrNbEjbYba7InKQ6ESZAA — Marc Gravell, Aug 30 '18 at 00:05

Eric Lippert · Accepted Answer · 2018-08-30T00:23:49.373

Is this just not implemented as an optimization or is there anything I am missing?

For the specific example you give, you'd probably not want to make that code transformation because it changes the semantics of the program. If the new throws an exception, in the original program it should do so before the execution of the delegate, and in your transformation, the side effect is deferred. Whether that's an important property that should be preserved is debatable. (And doing so also creates problems for the debugger; the debugger already must pretend that elements of closure classes are locals of the containing method body, and this optimization might complicate it further.)
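A small sketch of the semantic difference described above, under the assumption that a negative array size stands in for "the new throws". The names AllocationTimingDemo, Eager and Deferred are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

static class AllocationTimingDemo
{
    // Original shape: the array is allocated before the delegate exists.
    public static Func<int> Eager(int size, List<string> log)
    {
        var b = new byte[size];               // throws here when size is negative
        log.Add("eager: delegate created");
        return () => b.Length;
    }

    // Proposed shape: the allocation moves inside the lambda.
    public static Func<int> Deferred(int size, List<string> log)
    {
        log.Add("deferred: delegate created");
        return () => new byte[size].Length;   // throws only when the delegate is invoked
    }

    public static List<string> Run()
    {
        var log = new List<string>();

        try { Eager(-1, log)(); }
        catch (Exception) { log.Add("eager: threw"); }

        try { Deferred(-1, log)(); }
        catch (Exception) { log.Add("deferred: threw"); }

        return log;
    }

    static void Main() => Console.WriteLine(string.Join(" | ", Run()));
}
```

In the eager version the failure happens before any delegate exists; in the deferred version the delegate is created successfully and the failure surfaces later, at the call site.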

However, the more general point is germane. There are many optimizations you can do if you know that a closed-over variable is only used for its value.

When I was on the compiler team -- I left in 2012 -- Neal Gafter and I considered implementing such optimizations, as well as a number of more complex optimizations designed to reduce the likelihood of an expensive object's lifetime being extended too long by accident.

Aside: The simplest of the more complex scenarios is: we have two lambdas converted to delegates; one is stored in a short-lived variable and is closed over a local that contains a reference to an expensive object; one is stored in a long-lived variable and is closed over a local that refers to a cheap object. The expensive object lives as long as the long-lived variable even though it is not used. More generally, multiple closures could be constructed as a partition based on the closed-over relation; at the time we only partitioned closures based on nesting; closures at the same nesting level were one closure. The given scenario is rare and there are obvious workarounds, but it would be nice if it didn't happen at all.
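The scenario in the aside can be sketched as follows. This relies on a compiler implementation detail rather than on anything the language guarantees: it assumes both lambdas are lowered onto one shared closure object, which is the behavior described above. The names SharedClosureDemo, Make and ShareOneClosure are invented for the example.

```csharp
using System;

static class SharedClosureDemo
{
    public static (Action shortLived, Func<int> longLived) Make()
    {
        var expensive = new byte[1024 * 1024];  // used only by the short-lived lambda
        var cheap = 42;                         // used only by the long-lived lambda

        Action shortLived = () => Console.WriteLine(expensive.Length);
        Func<int> longLived = () => cheap;
        return (shortLived, longLived);
    }

    // True when both delegates point at the same closure object.
    public static bool ShareOneClosure()
    {
        var (shortLived, longLived) = Make();
        return ReferenceEquals(shortLived.Target, longLived.Target);
    }

    static void Main() => Console.WriteLine(ShareOneClosure());
}
```

If the two delegates share a Target, then keeping longLived alive also keeps expensive alive, even though longLived never touches it. The obvious workaround is to create the two lambdas in separate methods or scopes.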

We did not do so because there were more important optimizations and features during the period that we were implementing Roslyn, and we did not want to add risk to an already-long schedule.

We could perform such optimizations confidently because in C# it is pretty easy to know when a local has been aliased, and so you can know for sure whether it is ever written to after the closure is created.
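Why the "written to after the closure is created" condition matters can be shown with a short sketch. A lambda captures the variable, not its value, so copying the value into the closure is only safe when no later write exists. The names CaptureSemanticsDemo, readsVariable and readsSnapshot are illustrative.

```csharp
using System;

static class CaptureSemanticsDemo
{
    public static (int fromVariable, int fromSnapshot) Run()
    {
        int x = 1;
        int snapshot = x;                          // never written again after this point

        Func<int> readsVariable = () => x;         // captures the variable x itself
        Func<int> readsSnapshot = () => snapshot;  // capturing by value would be safe here

        x = 2;                                     // write after the closures were created

        return (readsVariable(), readsSnapshot());
    }

    static void Main() => Console.WriteLine(Run());   // prints (2, 1)
}
```

An optimizer that copied x by value into the first lambda would return 1 instead of 2, so it has to prove that no such write exists before applying the transformation.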

I do not know if those optimizations have been implemented in the meanwhile; likely not.

I also do not know if the compiler does such optimizations for C# 7 local functions, though I suspect the answer is "yes". See what happens if you try a local function!
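For anyone who wants to try it, a minimal local-function version might look like the sketch below. It uses a small fixed array instead of random bytes so the result is predictable, and the names LocalFunctionDemo and Average are made up. According to the comments on this page, the compiler then emits the capture type as a struct rather than a class, provided the local function is not converted to a delegate; that part is the commenters' observation, not something this snippet proves.

```csharp
using System;

static class LocalFunctionDemo
{
    public static double Run()
    {
        var b = new byte[] { 1, 2, 3, 4 };

        // Local function that captures b and is only ever called directly,
        // never converted to a delegate.
        double Average()
        {
            long sum = 0;
            foreach (var value in b) sum += value;
            return (double)sum / b.Length;
        }

        return Average();
    }

    static void Main() => Console.WriteLine(Run());
}
```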

re local functions: it becomes a struct: https://sharplab.io/#v2:EYLgtghgzgLgpgJwD4AEAMACFBGA3AWACh0tsA6AGQEsA7ARwOIGYsAmDAYQwG8iN/SANiwAWDAFlsACgCUfAb0IDl/AG4QEGYBgC8GGnADuWgJ7wA2tjSsxAKgxWbAXUbyV/FAFYAPABMA9gCuwAA2cAB8GABmuhiyuuFu7oruqQbGAEoQNAFgsmQAcnAAHjAAQmZwUFLAMoyp7igA7FpkHNAw3rQw4fkAgqqIEADmcLL1DQC+rkoN6poQqsOxUeNJKjgAnFKLw3VJk0k4wihi4qyySSlzGlqx6aYWjnYO1iIu6wKfAgHBYdGXWapa4NAQPLI5fx5GSFErlSrVWoTUEeFrANodLo0Hr9QYIEZjfZA9zTb4YMkYeYYXYrNbEjbYba7InKQ6ESZAA — Marc Gravell, Aug 30 '18 at 00:05
I have tried a local function; as @MarcGravell pointed out, the compiler creates a struct instead. But I guess there is still an opportunity to optimize it the same way I suggested: if I'm not accessing the local variable outside of the local function, it can be inlined. — Selman Genç, Aug 30 '18 at 00:10
Thanks for the answer, @EricLippert. I am trying to understand this statement: "multiple closures could be constructed as a partition based on the closed-over relation;" but I can't visualize it in my head. Can you elaborate on it? A short code example might help me and future readers :) — Selman Genç, Aug 30 '18 at 00:48
I gave an example of the compiler failing to construct multiple closures at https://stackoverflow.com/q/3885106/18192, which @EricLippert summarized in his answer. — Brian, Aug 30 '18 at 12:56
@SelmanGenç: What I'm getting at is: consider my example: lambda 1 is closed over local x, lambda 2 is closed over y, we make a single closure for x and y. We could make one closure for x and one for y. Now suppose we have lambda 1 closed over x and y, lambda 2 closed over x, y and z, lambda 3 closed over z. What's the right number of closures to make that minimizes both the number of closures created (because we don't want GC pressure) and ensures that no lambda keeps a variable alive too long? That's the partition problem to solve if you want to do it right. — Eric Lippert, Aug 30 '18 at 16:59
That's more clear now, thank you. Handling closures seems like a challenging task for the compiler. — Selman Genç, Aug 30 '18 at 17:29
