
For example,

public int DoSomething(in SomeType something){
  int local(){
     return something.anInt;
  }
  return local();
}

Why does the compiler issue an error that the something variable cannot be used in the local function?

Peter Cordes
Frank
  • "because the language is designed that way" / "the language has not implemented this feature" would be the basic answer. What are you actually trying to achieve that is blocked by this constraint? – Pac0 Aug 14 '22 at 14:37
  • @Pac0 Something very hard to explain that is way outside the usual purview of C# devs. I am actually implementing a C# compiler, but it uses pre-verified C# syntax with Roslyn as input, and this particular quirk doesn't fit with my needs. But it does make me wonder what I am missing, as if there is some edge case here then I will probably need to understand it for what I am doing too – Frank Aug 14 '22 at 14:42
  • @Pac0 If I had to guess, I'd say it has something to do with optimisation...that the local function is not permitted to 'see' something that is left in the originating/declaring scope... – Frank Aug 14 '22 at 14:44
  • Fair enough for your need of understanding the rationale. I don't see right now a specific theoretical point that would make this impossible (though, maybe there is). Maybe this question will receive an enlightening answer! But keep in mind that in a language, features are unimplemented by default. It's some work to analyze, prioritize, design, etc. to add fancy perks to the language. So it may very well be that C# doesn't allow it simply because... it has not been implemented in the language (yet?). See this Eric Lippert answer/digression here https://stackoverflow.com/a/8673015/479251 – Pac0 Aug 14 '22 at 14:46
  • Making me think that it would be a good question to ask on their GitHub (from roslyn or c# language?). This could become a feature request. – Pac0 Aug 14 '22 at 14:52
  • @Pac0 I will probably ask on their Discord. I might get a "because Eric was drunk that night" kind of answer that would actually be enormously satisfying if true. – Frank Aug 14 '22 at 15:57
  • C# "local" functions are just [nested-functions], right? Is there some useful distinction between that tag and [local-functions], or is the idea to have a separate tag for the same concept in C# vs. in other languages, like it could have been [c#-local-functions] vs. [GNU-C-nested-functions] vs. [pascal-nested-functions]? Anyway, I suspect these tags should be made synonyms, unless there's a distinction I'm missing. (@Charlieface) – Peter Cordes Aug 15 '22 at 02:42
  • @PeterCordes I suggest you put that on [meta.so], personally I think they should remain separate, as C# local functions have their own quirk as you can see. – Charlieface Aug 15 '22 at 03:11
  • @Charlieface: Every language has its own quirks. It's a tag that already needs to be used with a language to make sense, like [C#][nested-functions] vs. [pascal][nested-functions]. But yeah, should probably get discussed on meta.SO for people to weigh in with more detail. – Peter Cordes Aug 15 '22 at 03:22
  • @Charlieface: Posted [Should \[local-functions\] be a synonym of \[nested-function\], or do C# local functions warrant a separate tag?](https://meta.stackoverflow.com/q/419841) on meta. – Peter Cordes Aug 15 '22 at 04:37

1 Answer


The documentation on local functions states the following:

Variable capture

Note that when a local function captures variables in the enclosing scope, the local function is implemented as a delegate type.

And looking at lambdas:

Capture of outer variables and variable scope in lambda expressions

A lambda expression can't directly capture an in, ref, or out parameter from the enclosing method.

The reason is simple: it's not possible to lift these parameters into a class, due to ref escaping problems. And that is what would be necessary to do in order to capture it.
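To make the "lifting" concrete, here is a rough hand-written sketch of the kind of closure class a capturing lambda turns into. The names `Closure`, `captured`, `Local` and `Demo` are made up for illustration, `SomeType` is assumed to be a struct with a public `anInt` field, and the code the compiler really generates looks different. The point is only that a captured variable has to become a field on a heap object, and a class field cannot be a reference to someone else's variable, so the best this sketch can do is store a copy.

```csharp
using System;

// Hypothetical stand-in for the question's SomeType; assumed to be a struct.
public struct SomeType
{
    public int anInt;
}

// Hand-written approximation of a compiler-generated closure class.
// The real generated type has a different name and shape.
public sealed class Closure
{
    // A captured variable becomes an ordinary field on a heap object.
    // A class cannot declare a "ref readonly SomeType" field, so an
    // in parameter has nowhere to live here; only a copy fits.
    public SomeType captured;

    public int Local() => captured.anInt;
}

public static class Demo
{
    public static Func<int> DoSomething(in SomeType something)
    {
        // Copies the value out of the reference into the heap object.
        var closure = new Closure { captured = something };
        return closure.Local;
    }

    public static void Main()
    {
        var value = new SomeType { anInt = 42 };
        Func<int> later = DoSomething(value);
        value.anInt = -1;           // the closure never sees this write
        Console.WriteLine(later()); // prints 42
    }
}
```

Because the field holds a copy, the delegate stays valid after the caller's variable is gone, but it also no longer observes the original location, which is exactly the reference behaviour an `in` parameter is supposed to give.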

Example

public Func<int> DoSomething(in SomeType something){
  int local(){
     return something.anInt;
  }
  return local;
}

Suppose this function is called like this:

public Func<int> Mystery()
{
    SomeType ghost = new SomeType();
    return DoSomething(ghost);
}

public void Scary()
{
    var later = Mystery();
    Thread.Sleep(5000);
    later(); // oops
}

The Mystery function creates a ghost and passes it as an in parameter to DoSomething, which means that it is passed as a read-only reference to the ghost variable.

The DoSomething function captures this reference into the local function local, and then returns that function as a Func<int> delegate.

When the Mystery function returns, the ghost variable no longer exists. The Scary function then uses the delegate to call the local function, and local will try to read the anInt property from a nonexistent variable. Oops.

The "You may not capture reference parameters (in, out, ref) in delegates" rule prevents this problem.

You can work around this problem by making a copy of the in parameter and capturing the copy:

public Func<int> DoSomething2(in SomeType something){
  var copy = something;
  int local(){
     return copy.anInt;
  }
  return local;
}

Note that the returned delegate operates on the copy, not on the original ghost, so it always has a valid value to read anInt from. The trade-off is that any later changes to ghost have no effect on the copy.

public int Mystery()
{
    SomeType ghost = new SomeType() { anInt = 42 };
    var later = DoSomething2(ghost);
    ghost = new SomeType() { anInt = -1 };
    return later(); // returns 42, not -1
}
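If the local function is only ever called directly, as in the original question, another option that may be worth considering is to avoid capturing altogether and forward the parameter as an argument. The sketch below assumes `SomeType` is a struct with a public `anInt` field and uses a `static` local function (C# 8 or later) so the compiler rejects any accidental capture; the names `Demo` and `Local` are illustrative only.

```csharp
using System;

// Hypothetical stand-in for the question's SomeType; assumed to be a struct.
public struct SomeType
{
    public int anInt;
}

public static class Demo
{
    public static int DoSomething(in SomeType something)
    {
        // Nothing is captured: the read-only reference is passed along
        // explicitly, so no closure object is needed.
        return Local(in something);

        // "static" makes the compiler reject any use of the enclosing
        // method's locals or parameters inside this function.
        static int Local(in SomeType s) => s.anInt;
    }

    public static void Main()
    {
        var value = new SomeType { anInt = 7 };
        Console.WriteLine(DoSomething(in value)); // prints 7
    }
}
```

Unlike the copy workaround, this keeps reference semantics, but it only helps while the call stays inside the enclosing method: a `Func<int>` has no parameter through which the reference could be passed, so returning the function as a delegate still requires a copy.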
Raymond Chen
Charlieface
  • This seems like a good start to explaining the issue, but "`ref` escaping problems" is a bit hand-wavy for my taste. What exactly is the connection between delegates and lambdas here? Is the restriction on lambdas itself arbitrary, or what underlies that technically? – Karl Knechtel Aug 14 '22 at 15:41
  • Hmm, so if i understand this correctly, the 'in' prevention is to cover the scenario that the method returns the local function itself, and only that scenario? – Frank Aug 14 '22 at 16:16
  • Yeah this does not really clarify things to me, but I feel I might be looking at this wrong. The way I see it in your above example the compiler should issue an error on "return local;" as that is the problem. Granted, the way it is is the way it is, but this looks to me like a lazy compiler implementation, rather than having a genuine rationale. – Frank Aug 14 '22 at 16:25
  • @KarlKnechtel A lambda which captures variables gets converted into a class, where the fields contain the captured variables. `ref` escaping is well documented: you cannot have a `ref` field in a class, it's only allowed as a local variable. The first quote I gave shows that local functions become lambdas if there are captured variables – Charlieface Aug 14 '22 at 16:27
  • Delaying enforcement until `return local;` means that the compiler must do escape analysis: "Is it possible for `local` to be called after this function returns?" Escape analysis is hard. For example, would `list.Mystery(e => e.value == local())` be safe? You don't know, because you don't know what `Mystery` does. You would have to create more and more complex rules to allow this if `Mystery` is `Select` or `Where`, but not other methods, and when you're done it's not clear that you made things any better. Simple rules are simple to explain and to understand. – Raymond Chen Aug 14 '22 at 16:30
  • @RaymondChen Sure. My use of the word "lazy" was perhaps a bit blunt, but essentially you are saying what I am thinking :-) Still - the original code in the original question *should* compile IMO. Whatever it takes to make that happen, even if the compiler specifically checks for that case, should be implemented IMO – Frank Aug 14 '22 at 16:32
  • @KarlKnechtel Lambdas converting to classes - didn't know that. Thx! – Frank Aug 14 '22 at 16:33
  • Note also that "lazy compiler implementation" is thinking about the problem at the wrong level. This is not an implementation question. This is a language question. What you definitely don't want is "Some compilers accept this code, but others reject it." Or "This code compiles only if you set optimization level to 2 or higher (when escape analysis kicks in), but other level 2 optimizations make our problem nearly impossible to debug." The rules for what constitutes a syntactically legal program need to be independent of implementation. – Raymond Chen Aug 14 '22 at 16:36
  • @RaymondChen Is this the case? What is the C# lang spec equivalent for this comment "Note that when a local function captures variables in the enclosing scope, the local function is implemented as a delegate type."? Note the use of the word implemented. – Frank Aug 14 '22 at 16:41
  • Yes lambdas must convert to classes, if you think about it there is no other way to capture a variable. I can't find local functions in the spec, I suspect it hasn't been updated yet. @RaymondChen The spec proposal does not seem to have it either? https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-7.0/local-functions – Charlieface Aug 14 '22 at 16:48
  • OK thanks all. I feel like this has answered and clarified a lot. – Frank Aug 14 '22 at 16:49
  • @Charlieface Partly in jest, but when they do write the spec, please could they write in the form "local functions are captured as classes and no compiler can handle in/ref etc etc APART FROM the case that Frank mentioned on SO issue #12234" Much appreciated. – Frank Aug 14 '22 at 16:53
  • I think the only thing they could do is potentially allow capturing `ref` and `in` parameters when no delegate is generated, but it's rather messy to define, and I think it unlikely that they would do so – Charlieface Aug 14 '22 at 16:55
  • @Charlieface Makes sense. Many thanks for your clarifications. Very enlightening. – Frank Aug 14 '22 at 16:57
  • However [Implementation as a delegate](https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/local-functions#implementation-as-a-delegate) (just before the "Variable capture" you are linking to) states *"Local functions are more flexible in that they can be written like a traditional method or as a delegate. Local functions are **only** converted to delegates **when used** as a delegate."*. So I don't think the delegate variable capturing rules really explain the limitation in question. Also, even for delegates, one can understand `ref` and `out` behavior... – Ivan Stoev Aug 15 '22 at 08:45
  • ... but not `in`. `in` is not intended to provide `ref` semantics like the other two modifiers, so capturing it like "regular" (not `in`) variable shouldn't be a problem (technically), except if we don't see something. Which would be good to be explained in the docs/specs. – Ivan Stoev Aug 15 '22 at 08:48
  • That's not true: `in` is supposed to provide reference semantics to the called function, so that it can read the original location if there were changes to it on another thread. That is not possible to do if it's lifted to a field – Charlieface Aug 15 '22 at 08:50
  • @Charlieface But still, Ivan faces the same ambiguities...I wouldn't be so stalwart...the spec is simply absent and whichever way you look at it, you *could* force the compiler to be lenient towards the *in* and the compiler probably *should*. It really does lean more towards that the compiler and language spec are both at fault. – Frank Aug 19 '22 at 19:44
  • Not sure which bit of the spec you are referring to. The "local functions" bit hasn't been updated yet anyway, and the proposal indicates it behaves like a lambda. To allow it to not behave like a lambda is possible, but pretty confusing, so I doubt that would ever be relaxed. That a lambda cannot use `in` or `ref` or `out` variables from the enclosing scope is already explained, it's simply impossible. – Charlieface Aug 20 '22 at 21:31
  • @IvanStoev 's point about them being *only* converted to a delegate when used as such, refers specifically to the *implementation* (ie generating a delegate object on the heap etc), not the *semantics*, which do not appear to change. – Charlieface Aug 20 '22 at 21:33