I know this may sound strange, but I don't even know how to search for this syntax on the internet, and I am not sure what exactly it means.

I was looking through some MoreLINQ code and noticed this method:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

    return _(); IEnumerable<TSource> _()
    {
        var knownKeys = new HashSet<TKey>(comparer);
        foreach (var element in source)
        {
            if (knownKeys.Add(keySelector(element)))
                yield return element;
        }
    }
}

What is this odd return statement, `return _();`?

kuskmen
  • Do you mean the fact it says "yield return" rather than just "return"? If so searching "yield return" or "yield c#" will get you useful results such as https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/yield . – Chris Jul 26 '17 at 10:02
  • 6
    Or do you mean: `return _(); IEnumerable _()` ? – Alex K. Jul 26 '17 at 10:02
  • 6
    @Steve, I wonder if the OP is referring more to the `return _(); IEnumerable _()` than the `yield return` ? – Rob Jul 26 '17 at 10:03
  • 2
    That is a local function inside a method. The OP is not referring to the yield return. See here: [local function C#7](https://blogs.msdn.microsoft.com/dotnet/2016/08/24/whats-new-in-csharp-7-0/) – Christoph K Jul 26 '17 at 10:03
  • 2
    @Steve did you even read full code? Op has not asked anything about yield. – Akash Kava Jul 26 '17 at 10:04
  • 5
    I think he meant this line `return _(); IEnumerable _()`. He could be confused by the way it looks rather than by the actual return statement. – mrogal.ski Jul 26 '17 at 10:05
  • 5
    @AkashKava The OP said there was an odd return statement. Unfortunately, the code contains **two** return statements. So it is understandable if people are confused as to which he/she is referring to. – mjwills Jul 26 '17 at 10:05
  • 1
    I voted to reopen, as i think OP really meant the `return _(); IEnumerable _()`. But maybe the question should be clarified... – Pikoh Jul 26 '17 at 10:07
  • If he is talking about `return _()` this isn't a duplicate. – Stuart Jul 26 '17 at 10:07
  • Sorry guys, I am referring to 'return _();' I know the history behind yield return. – kuskmen Jul 26 '17 at 10:08
  • Having read comments I would agree that this may not be a duplicate as first assumed. As mjwills says though the question wasn't very clear. If the OP clears up the confusion on what they are asking about I will be more than happy to reopen if appropriate. – Chris Jul 26 '17 at 10:08
  • 5
    Edited the question, and once again sorry for the confusion. – kuskmen Jul 26 '17 at 10:10
  • 3
    @kuskmen: No need to be sorry. Its all a learning experience and it all just shows the system works! Comments, closes, reopens and a better question with a good answer! :) – Chris Jul 26 '17 at 10:11
  • 2
    This is bad style. The local function called `_` should have an informative name. Also its declaration should start on a new line. – Julien Couvreur Jul 29 '17 at 06:41
  • 1
    @JulienCouvreur agreed. Those are the only weird things about this code. Who the heck starts a function declaration on the same line as a return?? Could also use a lambda unless I'm missing something. – Kat Jul 31 '17 at 19:15

2 Answers

107

This is C# 7.0, which supports local functions:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        // This simply calls the local function declared below
        return _LocalFunction();

        // This is a local function; a return (or yield return) inside it
        // belongs to the local function, not to DistinctBy
        IEnumerable<TSource> _LocalFunction()
        {
            var knownKeys = new HashSet<TKey>(comparer);
            foreach (var element in source)
            {
                if (knownKeys.Add(keySelector(element)))
                    yield return element;
            }
        }
    }

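Before C# 7.0, with a separate private iterator method (a lambda assigned to a Func<T> cannot contain yield return, so it is not an option here)

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        // Validation has already run; hand off to the iterator method
        return DistinctByImpl(source, keySelector, comparer);
    }

    // DistinctByImpl is a hypothetical helper name used for illustration
    private static IEnumerable<TSource> DistinctByImpl<TSource, TKey>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
    {
        var knownKeys = new HashSet<TKey>(comparer);
        foreach (var element in source)
        {
            if (knownKeys.Add(keySelector(element)))
                yield return element;
        }
    }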

The trick is, _() is declared after it is used, which is perfectly fine.

Practical use of local functions

The example above just demonstrates how a local function can be used; if you are going to invoke the function only once, it usually adds little.

In the example above, however, as Phoshi and Luaan mention in the comments, the local function has a real advantage. A method containing yield return does not execute until someone iterates its result. Here, the code outside the local function runs immediately, so the parameter validation is performed at the call even if no one ever iterates the result.
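The difference between deferred and immediate validation can be sketched as follows. This is a minimal illustration, not MoreLINQ code: the class `IteratorValidationDemo` and the methods `LazyChecked`, `EagerChecked`, and `ThrowsOnCall` are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;

// All names in this class are hypothetical and exist only for illustration.
public static class IteratorValidationDemo
{
    // Plain iterator method: the whole body, including the null check,
    // is deferred until the caller starts enumerating the result.
    public static IEnumerable<int> LazyChecked(IEnumerable<int> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        foreach (var item in source)
            yield return item;
    }

    // Wrapper plus local iterator: the null check runs as soon as the
    // method is called; only the local function's body is deferred.
    public static IEnumerable<int> EagerChecked(IEnumerable<int> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return Iterate();

        IEnumerable<int> Iterate()
        {
            foreach (var item in source)
                yield return item;
        }
    }

    // Returns true if merely calling the method with null (without
    // enumerating the result) throws ArgumentNullException.
    public static bool ThrowsOnCall(Func<IEnumerable<int>, IEnumerable<int>> method)
    {
        try { method(null); return false; }
        catch (ArgumentNullException) { return true; }
    }
}
```

With these definitions, `IteratorValidationDemo.ThrowsOnCall(IteratorValidationDemo.LazyChecked)` returns false, because nothing runs until enumeration, while the same call with `EagerChecked` returns true.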

Often we have repeated code within a method. Let's look at this example:

  public void ValidateCustomer(Customer customer){

      if( string.IsNullOrEmpty( customer.FirstName )){
           string error = "Firstname cannot be empty";
           customer.ValidationErrors.Add(error);
           ErrorLogger.Log(error);
           throw new ValidationError(error);
      }

      if( string.IsNullOrEmpty( customer.LastName )){
           string error = "Lastname cannot be empty";
           customer.ValidationErrors.Add(error);
           ErrorLogger.Log(error);
           throw new ValidationError(error);
      }

      ... on  and on... 
  }

I could simplify this with a local function:

  public void ValidateCustomer(Customer customer){

      void _validate(string value, string error){
           if(string.IsNullOrEmpty(value)){

              // customer is captured from the enclosing method
              customer.ValidationErrors.Add(error);

              ErrorLogger.Log(error);
              throw new ValidationError(error);                   
           }
      }

      _validate(customer.FirstName, "Firstname cannot be empty");
      _validate(customer.LastName, "Lastname cannot be empty");
      ... on  and on... 
  }
Akash Kava
  • 4
    @ZoharPeled Well.. the posted code *does* show a use for the function.. :) – Rob Jul 26 '17 at 10:13
  • In the second code snippet, in the comment above `return`, you wrote `...executing _AnonymousFunction()`; it would be better to write `..executing func()`, as that is the name of the `Func<>`. It might confuse a newbie looking at something like this for the first time. – mrogal.ski Jul 26 '17 at 10:14
  • 1
    I mean, how is that different than writing the code inside the main method? I mean, the same thing could be done without the anonymous function, or am I wrong about that? – Zohar Peled Jul 26 '17 at 10:14
  • @ZoharPeled many times we have repeated code inside a method itself; creating a new function and passing all the variables is cumbersome, so we can declare a local function and reuse it many times. This example is just a demonstration; if you are not going to invoke the method more than once, there is no use for such a method. – Akash Kava Jul 26 '17 at 10:15
  • 1
    @ZoharPeled It could not in this case, because it's leveraging the syntactic sugar of `yield`. Yes, there is a way to write it explicitly, but it's far simpler here. – Rob Jul 26 '17 at 10:15
  • Ok, I'm convinced :-) – Zohar Peled Jul 26 '17 at 10:16
  • 1
    @AkashKava what is the benefit of doing this over breaking it out into another reusable method? Excuse any discussion here but I have always failed to see the purpose, and testability, of local/inline functions. – ColinM Jul 26 '17 at 10:16
  • @ColinM Some methods don't really make sense to be re-used elsewhere. There are admittedly few cases where it's useful to use a local anonymous function, but there *are* cases, and I believe the code posted by the OP is one of them. – Rob Jul 26 '17 at 10:17
  • Omg, I knew about local functions in C# 7.0, but the way the code was formatted tricked me and I couldn't see it. I hope this kind of formatting doesn't become common practice or, even worse, standard. – kuskmen Jul 26 '17 at 10:18
  • 2
    @ColinM one of the benefits is that the anonymous function can easily access variables from its 'host'. – mjwills Jul 26 '17 at 10:19
  • @ColinM check my example; for a very small fragment of repeated code, you don't need testability. – Akash Kava Jul 26 '17 at 10:22
  • That makes sense, instead of checking 10 parameters for null etc then it can just be a local method to reduce repeatability. I guess I was thinking of more complex cases. – ColinM Jul 26 '17 at 10:37
  • 6
    Are you sure that in C#-speak this is actually called an anonymous function? It seems to have a name, namely `_AnonymousFunction` or just `_`, while I'd expect a genuine anonymous function to be something like `(x,y) => x+y`. I would call this a local function, but I'm not used to C# terminology. – chi Jul 26 '17 at 11:57
  • 1
    @chi You are correct. The proper terminology is [local functions.](https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/local-functions) – Kris Harper Jul 26 '17 at 13:36
  • 12
    To be explicit, as nobody seems to have pointed it out, this code snippet is using the local function because it is an _iterator_ (note the yield), and so executes lazily. Without the local function you would need to either accept that input validation happens on first use, or have a method which will only ever be called by one other method lying around for very little reason. – Phoshi Jul 26 '17 at 14:22
  • 6
    @ColinM The example kuskmen posted is actually one of the main reasons this was finally implemented - when you make a function with `yield return`, no code is executed until the enumerable is actually enumerated. This is undesirable, since you want to e.g. verify arguments right away. Before local functions, the only way to do this in C# was to separate the method into two methods - one with `yield return`s, and the other without. Local functions allow you to declare the method that uses `yield` *inside*, avoiding clutter and potential misuse of a method that's strictly internal to its parent and not reusable. – Luaan Jul 26 '17 at 14:40
  • 2
    @ZoharPeled One example: suppose you had a recursive function `f` that has a set of preconditions on its inputs. If you put this validation in `f`, then you redundantly repeat it on every recursion. A typical solution to this is to create a private "helper function" that `f` calls, which actually does the recursion, sans input validation. With language support for local functions, you can hide that function inside `f` (as no one should directly be calling it, except `f`). [Here's an example.](https://stackoverflow.com/a/43549117/3141234) It's Swift, but the essence is the same. – Alexander Jul 27 '17 at 04:23
24

Consider the simpler example

void Main()
{
    Console.WriteLine(Foo()); // Prints 5
}

public static int Foo()
{
    return _();

    // declare the body of _()
    int _()
    {
        return 5;
    }
}

_() is a local function declared within the method containing the return statement.
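As a further sketch of the same idea, a local function can also read and update the parameters and locals of its enclosing method, which is the benefit mentioned in the comments on the other answer. The class `LocalFunctionCaptureDemo` and the names `SumAbove` and `AddIfAbove` are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// All names in this class are hypothetical and exist only for illustration.
public static class LocalFunctionCaptureDemo
{
    public static int SumAbove(IEnumerable<int> values, int threshold)
    {
        int total = 0;
        foreach (var v in values)
            AddIfAbove(v);
        return total;

        // Declared after the return statement, yet callable above.
        // It reads 'threshold' and updates 'total' from the enclosing method
        // without either being passed as an argument.
        void AddIfAbove(int value)
        {
            if (value > threshold)
                total += value;
        }
    }
}
```

For example, `LocalFunctionCaptureDemo.SumAbove(new[] { 1, 5, 10 }, 4)` adds 5 and 10 and returns 15.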

Stuart
  • 3
    Yes, I know about local functions; it was the formatting that fooled me ... I hope this does not become standard. – kuskmen Jul 26 '17 at 10:19
  • 21
    Do you mean the function declaration starting on the same line? If so, I agree, it's horrible! – Stuart Jul 26 '17 at 10:20
  • 3
    Yes, that's what I meant. – kuskmen Jul 26 '17 at 10:21
  • 10
    Apart from that, naming it underscore is horrible as well – Icepickle Jul 26 '17 at 17:41
  • @Stuart It is not horrible; in C#, statements end with a semicolon `;`, so whether you write the next statement on a new line or not is purely a matter of choice. You can even write many statements on one line. – Akash Kava Aug 02 '17 at 08:06
  • 2
    @AkashKava: the question is not whether it is legal C#, but whether the code is easy to understand (and hence easy to maintain and pleasing to read) when formatted like this. Personal preferences play a role, but I tend to agree with Stuart. – PJTraill Aug 02 '17 at 09:40
  • @PJTraill actually no one formats code like this; it is more the kind of thing that makes a tricky interview question. – Akash Kava Aug 05 '17 at 06:38