How does Linq know the name of an element?

Question

I’ve solved a C# beginner Kata on Codewars, asking to return a string[] with only 4-character names. I used a list that is filled in an if statement and then converted back to a string[] and returned.

My question is regarding the best practice solution below, which is presented later.

I understand that the same string[] that comes as an argument is refilled with elements and returned. But how does the program know that each element of the array is called “name”, as it’s never mentioned before?

Does linq know that a variable with a singular name is an element of a plural name group, here “names”?

Thanks for helping out!

using System;
using System.Collections.Generic;
using System.Linq;

public static class Kata {
  public static IEnumerable<string> FriendOrFoe (string[] names) {
    return names.Where(name => name.Length == 4);
  }
}

`name => name.Length == 4` is a one-argument delegate. You've named the argument `name`, but it would work just as well if it was `banana`. The individual elements of the array `names` have no names of their own (and of course the array argument is named `names` only because it's declared that way as well, `giraffes` would also work). — Jeroen Mostert, Dec 16 '21 at 13:33
It knows it's called `name` because you called it that in the part before the `=>`. You could change `name => name.Length == 4` to `turnip => turnip.Length == 4` and it would work exactly the same. — DavidG, Dec 16 '21 at 13:34
I voted to reopen this question because I think the duplicate is a poor fit. The question which has been selected as a duplicate doesn't ask the same question as this one, and none of the answers are answers to this question. In addition it's a closed question, and 1 of the 3 duplicates to which it points is deleted. The other 2 are general questions about lambdas and none of the answers obviously answer this question. It's possible the answer to this question is buried somewhere in that lot, but it's certainly not easy to find. — Tim, Dec 16 '21 at 13:45
@Tim In what way do they fail to answer the question of how the parameter(s) used in the lambda are defined? There were plenty of answers stating that. How was that hard to find for you? There are three separate answers all describing that. It's not like it's hidden. — Servy, Dec 16 '21 at 13:49
How to use the parameter is indeed stated, but that's not this question. This question is about the naming of the parameter. The first two comments on this question do answer it clearly and succinctly, but I don't think that information is obvious in the duplicate target and its duplicate targets. — Tim, Dec 16 '21 at 13:56
To clarify, looking again at the answers on the dupe, they all use `x` for their lambda variable but none of them state why they use `x`, or give advice how to write a suitable name for the variable. I think it's a perfectly reasonable (and distinct) question for someone new to lambda syntax to wonder how the original author of the code knew what variable name to use as there is no `var x`/`var name` statement anywhere. — Tim, Dec 16 '21 at 13:59
@Tim Considering the answers in the duplicate state the same thing as those comments, why do you think the duplicate doesn't answer the question? The question asks where the variable is defined, the duplicate describes where the variable is defined. That is the answer to the question. — Servy, Dec 16 '21 at 13:59
@Tim This question does not ask how to choose what the name is, it asks the mechanism by which one defines it. It asks where the place the name is selected, nothing about what it should be. The duplicate answers exactly that. If this question *did* ask what you *should* name something, it would be asking for opinions and would merit closure for that reason. But that's not what it asks. It asks where the name is decided on, not why the programmer who named it chose what they did. — Servy, Dec 16 '21 at 14:01
Fair point, I muddied the waters by mentioning suitability of the name, sorry for doing that. On the question of being a dupe, I suppose I think that because the dupes are not specific to the naming of the variable, it's less clear for a beginner how the program knows where the name comes from. You presumably disagree though and think it is clear enough, and I'm happy to accept that as a difference of opinion. If others want to vote to reopen they can, but I'm content I've put my point across as well as I'm able and if there is no consensus to overturn your closure then I accept that. — Tim, Dec 16 '21 at 14:08
@Tim How is it not clear where the name comes from? There are 3 answers all stating where the name comes from. Explicitly. That they state that information isn't a matter of opinion. It's a statement of fact. — Servy, Dec 16 '21 at 15:03
@Servy You're right, I was asking where the name is decided on, as this chunk of code wasn't my doing. Might have worded my question a little odd. But going through most answers was super helpful and now I know the first mention of "name" is of course the moment it is declared and **by context** is of type string. — Garrag, Dec 16 '21 at 23:01
@Garrag Which of course you'd also know if you read the duplicate. That people just repeated the duplicates to you doesn't make the question not a duplicate. It means people just care more about re-posting the same answers over and over. — Servy, Dec 16 '21 at 23:42

Caius Jard · Accepted Answer · 2021-12-16T15:53:03.493

I understand that the same string[] that comes as an argument is refilled with elements and returned

I'll address this briefly, because it's not the question you're asking. This isn't true - no refilling of anything occurs. The original array is unaltered and a Where is a loop that runs over the array and selectively emits items from it when the test presented to it evaluates to true for that item. The way it does this is via a special construct called a yield return which is a way of allowing code to return from a method and then re-enter it and carry on from where it left off before, rather than starting all over again from the beginning of the method. There is only ever one array, and the looping/testing is not performed unless you start reading from the set of strings produced by the Where. If you want to know more about that, drop a comment.
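To make the "nothing runs until you read from it" point concrete, here is a minimal sketch. MyWhere is a hypothetical stand-in written for illustration; it is not the real LINQ source, and the sample names are made up. It counts how often the test runs, before and after the result is enumerated:

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Hypothetical, simplified stand-in for Where (an assumption for illustration only).
    public static IEnumerable<string> MyWhere(IEnumerable<string> source, Func<string, bool> test)
    {
        foreach (var item in source)
        {
            if (test(item))
                yield return item; // hand one item to the caller; resume here on the next request
        }
    }

    public static void Main()
    {
        string[] names = { "Ryan", "Kieran", "Mark" };
        int testsRun = 0;

        var fourLetterNames = MyWhere(names, name => { testsRun++; return name.Length == 4; });
        Console.WriteLine(testsRun); // 0: no element has been tested yet

        foreach (var name in fourLetterNames)
            Console.WriteLine(name); // Ryan, then Mark

        Console.WriteLine(testsRun);     // 3: each element was tested once, during the foreach
        Console.WriteLine(names.Length); // 3: the original array is untouched
    }
}
```

The array passed in is only ever read; the filtered items are produced one at a time as the foreach asks for them.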

Moving on..

Does linq know that a variable with a singular name is an element of a plural name group, here “names”?

No; the compiler knows the variable name because that's what you chose to call it just after the Where(

Perhaps it would help to link it to something you already know. It would be perfectly acceptable to write this code:

public static class Kata {
  public static IEnumerable<string> FriendOrFoe (string[] names) {
    return names.Where(IsNameOfLengthFour);
  }

  static bool IsNameOfLengthFour(string name){
    return name.Length == 4;
  }
}

Where demands some method be supplied that takes a string and returns a boolean. It demands that the input be a string because it's being called on names, which is an array of string. If it were ints in the array, the method passed to Where would have to take an int.

In C# there's often a push to make things more compact, so to get rid of all that wordiness above, we have a much more compact form of writing method bodies. Let's reduce our wordy version:

static bool IsNameOfLengthFour(string name){
  return name.Length == 4;
}

We can chuck out the return type, because we can guess that from the type being returned. We can get rid of static too, and just assume static if we're calling it from inside another static, or not if we're not. We can ditch the input type too, because we can guess that from the type going in.

IsNameOfLengthFour(name){
  return name.Length == 4;
}

If we have a special syntax that is only one line and must be a value that is auto returned, we can get rid of the return, and the {} because it's only one line, so we don't need to fence off multiple statements:

IsNameOfLengthFour(name) => name.Length == 4

Now, we don't actually need a method name any more either if we're going to use this in some place where a name is irrelevant, and we really don't need () for a single argument either:

name => name.Length == 4

And that's enough of an expression for the compiler to be able to form a method out of it, and plumb it into something that expects a method taking a string and returning a boolean. We've thrown away all the fluff of a method that we humans like (names and identifiers) and given the compiler just the raw nuts and bolts it needs - the logic of the method. The compiler will recreate the rest of the fluff when it wires it all together for us; we won't ever be able to call this mini-method from elsewhere in our code but we don't care. We got what we wanted, which is a nice compact way of expressing the logic:
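As a rough check of this, the sketch below passes the same test to Where in three ways: as a named method, as an explicitly typed Func<string, bool>, and as a lambda whose parameter is deliberately called banana. Strictly speaking the lambda is converted to a delegate rather than being a shortened method declaration, but for this purpose all three behave the same. The class name LambdaForms and the sample names are made up for illustration:

```csharp
using System;
using System.Linq;

public static class LambdaForms
{
    // The wordy, named version of the test.
    public static bool IsNameOfLengthFour(string name)
    {
        return name.Length == 4;
    }

    public static void Main()
    {
        string[] names = { "Ryan", "Kieran", "Mark" };

        // 1. Named method passed by name.
        var viaMethod = names.Where(IsNameOfLengthFour);

        // 2. Lambda with the parameter type written out, stored in a delegate variable.
        Func<string, bool> test = (string name) => name.Length == 4;
        var viaDelegate = names.Where(test);

        // 3. Bare lambda; the parameter name is whatever we write before the =>.
        var viaLambda = names.Where(banana => banana.Length == 4);

        Console.WriteLine(string.Join(",", viaMethod));   // Ryan,Mark
        Console.WriteLine(string.Join(",", viaDelegate)); // Ryan,Mark
        Console.WriteLine(string.Join(",", viaLambda));   // Ryan,Mark
    }
}
```

In form 3 the compiler works out that banana is a string because Where is being called on a string[].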

Where(n => n.Length==4);

You did a good job, calling the argument to this mini-method something sensible. I see x used a lot and it gets really confusing when what X is changes.. For example:

names
  .Where(name => ...)
  .GroupBy(name => ...)
  .Select(g => g.First())
  .Where(name => ...)

Where works on your array of names so calling the argument to the delegate name or n is a good idea. Where will filter it down but ultimately it still emits a set of strings that are names, so it's still a good idea to call it name on the way into a GroupBy.. But a GroupBy produces a set of IGrouping, not a set of string, so the thing coming out of a GroupBy is no longer a name.. In the next Select I call it g to reflect that it's a grouping, not a name, but I then take the first item in the grouping which is, actually, a name.. So in the final Where I go back to calling the input argument name to reflect what it's back to being..
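Here is one way the elided chain could be filled in so that it runs. The specific tests (length 4, grouping by first letter, ending in "k"), the method name FirstPerInitialEndingInK and the sample names are all assumptions made up for illustration; only the shape of the chain comes from the text above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NamingDemo
{
    // Hypothetical pipeline; the lambda bodies are invented for the example.
    public static IEnumerable<string> FirstPerInitialEndingInK(string[] names)
    {
        return names
            .Where(name => name.Length == 4)       // string in: still a name
            .GroupBy(name => name[0])              // string in, IGrouping<char, string> out
            .Select(g => g.First())                // grouping in, back to a single name
            .Where(name => name.EndsWith("k"));    // string in again
    }

    public static void Main()
    {
        string[] names = { "Ryan", "Kieran", "Mark", "Mike", "Rosa" };
        Console.WriteLine(string.Join(",", FirstPerInitialEndingInK(names))); // Mark
    }
}
```

Reading the parameter names down the chain (name, name, g, name) tells you what kind of thing each step receives without having to work out the types.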

When LINQ statements get much more complicated, it really helps to name these arguments well.

(Note: in this answer I've used words like "list" or "set" and I mean those in the general English sense that an "array is a list of ..", not a specifically C# List<xxx> or HashSet<xxx> sense. If you see lowercase words that align with C# types, they are not intended to refer to that specific type.)

You're really fudging here -- an actual method with syntactic sugar shortening is not the same as a Func<> or Predicate<>. — Stu, Feb 15 '22 at 23:25

crashmstr · Answer 2 · 2022-02-15T22:15:09.707

I understand that the same string[] that comes as an argument is refilled with elements and returned.

No, absolutely not. A new sequence of strings is returned based on the input array. The input array is not modified or changed in any way.

But how does the program know that each element of the array is called “name”, as it’s never mentioned before?

name is a parameter to an anonymous function. The name parameter is a string based on context. This could be x or ASDASDASD or whatever you want, but here we use name since we have, on each call, one "name" from names.

Thus,

names is an array of strings passed into the function
The .Where returns a new IEnumerable<string> from the current array based on a predicate function (i.e. one that returns true for a match, false to omit)
The predicate name => name.Length == 4 takes a string and returns true if the string is length 4
The return from the function is the strings from names that are exactly 4 characters in length

edited Feb 15 '22 at 22:15

answered Dec 16 '21 at 15:03


*A completely new array is returned based on the input array* - That's not actually true – Caius Jard Dec 16 '21 at 15:36
I don't think it is a good idea to switch terms or use non-canonical terms for the same constructs in an answer for a beginner (e.g. "anonymous function", "predicate"). Also `.Where` never returns an array, so that is just wrong. – NetMage Dec 16 '21 at 20:19
Thanks, I think 'The name parameter is a string based on context' is all I was asking for. The .Where is checking the array it knows is of string and the first mention of name is the declaration. Check! – Garrag Dec 16 '21 at 23:22
@Garrag Probably also worth noting that there are two versions of Where. One is `Where(list_of_strings, function_that_takes_a_string_and_returns_a_bool)` and the other is `Where(list_of_strings, function_that_takes_a_string_plus_int_and_returns_a_bool)`. The first argument represents the list the Where is called on (because Where is an extension method). The second is the function. You've used the single argument version, so you don't use any `( )` and just write a name. If you used the second you would supply e.g. `.Where((str, idx) => ...)`, the str being the string, and idx being its index – Caius Jard Dec 17 '21 at 12:32
..within the list. This can be useful for example if you want to return based on where an item is.. `Where((str, idx) => idx%2==0 && str.StartsWith("A"))` only returns every other string, and only if it starts with A. Remember always that this is just a mini method declaration, and you write `(str, idx)` just like you would write `public bool GetEveryOtherStartingWithA(string str, int idx){ ...` – Caius Jard Dec 17 '21 at 12:35
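A small runnable sketch of the two-argument form described in the comments above, next to the equivalent named method. The class name IndexedWhereDemo and the sample names are made up for illustration:

```csharp
using System;
using System.Linq;

public static class IndexedWhereDemo
{
    // Named-method equivalent of the (str, idx) lambda.
    public static bool GetEveryOtherStartingWithA(string str, int idx)
    {
        return idx % 2 == 0 && str.StartsWith("A");
    }

    public static void Main()
    {
        string[] names = { "Anna", "Alex", "Bob", "Abe", "Amy" };

        // Two parameters need parentheses; idx is the element's position in the source.
        var viaLambda = names.Where((str, idx) => idx % 2 == 0 && str.StartsWith("A"));
        var viaMethod = names.Where(GetEveryOtherStartingWithA);

        Console.WriteLine(string.Join(",", viaLambda)); // Anna,Amy
        Console.WriteLine(string.Join(",", viaMethod)); // Anna,Amy
    }
}
```

Alex and Abe start with A but sit at odd positions, and Bob sits at an even position but does not start with A, so only Anna and Amy pass both halves of the test.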
