How do Expression Trees provide access to local variables?

Question

This question is not related to how closures work. This question is about how LINQ decides what to quote into a runtime parsable expression, and what to evaluate and put into that expression.

This question is trying to understand how LINQ works, to implement something similar in another language.

Consider the following LINQ query, to be converted into an expression tree:

var my_variable = "abc";
var qry = from x in source.Foo
          where x.SomeProp == my_variable
          select x.Bar;

which is mapped by the compiler into code:

var qry = source.Foo
           .Where(x => x.SomeProp == my_variable)
           .Select(x => x.Bar);

When this is converted to an expression tree, how does LINQ know which parts to quote into the tree, and which parts to evaluate so that only the result is placed in the tree?

For example, how does it know to evaluate my_variable and put the result into the expression, but convert x.SomeProp and == into parts of the LINQ Expression Tree?

Does the C# compiler have a hard-coded special list of which expressions are quoted for LINQ? (i.e. the outer-most operations which can be translated into SQL, including: ==, !=, &&, ||, <, >, <=, >=, substring, etc)

This is not a question about how closures capture variables, but a question about how LINQ knows which expressions to capture vs encode into expressions. — David Jeske, May 22 '17 at 19:07
LINQ doesn't close over anything *ever*. You're simply passing in lambdas to the LINQ calls, and it is those lambdas *which have nothing at all to do with LINQ* that are creating closures. If you want to know how lambdas create closures, you have the duplicate. If you don't, then there's nothing that your question is asking. Saying that your question isn't a duplicate *when you're asking exactly the same thing as the duplicate* isn't meaningful. — Servy, May 24 '17 at 18:13
Perhaps my use of the word 'capture' was confusing... LINQ implementations are able to see both unevaluated expressions (like +) and the results of local variables and function calls. This is what I'm asking about. — David Jeske, May 25 '17 at 19:28
*None* of the expressions are going to be evaluated at the time the expression is created, and *all* of them can be evaluated at any time by any consumer of the expression. This *also* has nothing to do with LINQ. The lambda is going to have all of the information about what the code is, and the actual query provider may (if it chooses to) evaluate some or all of the expressions to values, as they see fit. LINQ just takes the expressions and provides them to the query provider. — Servy, May 25 '17 at 19:39
I see my confusion and why this is not related to LINQ but only to Expression trees. I was thinking Expressions trees were merely an AST (a structured representation of the text of the lambda), but they also contain references to the live program objects. In the case of a captured variable that appears in a lambda, the Expression tree contains a MemberExpression/ConstantExpression pair which can access the box of the variable. — David Jeske, May 28 '17 at 03:32
this is the answer to my question .. https://stackoverflow.com/questions/6998523/how-to-get-the-value-of-a-constantexpression-which-uses-a-local-variable — David Jeske, May 28 '17 at 03:33
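
To make the last two comments concrete, here is a minimal sketch of what the tree looks like for a captured local. It assumes C# 9+ top-level statements and simplifies the question's `x.SomeProp == my_variable` to a plain string parameter `s`; the names `predicate`, `comparison`, `closure` and `field` are illustrative only. It is not a description of what any particular query provider does, only of what the tree makes available to it.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

string my_variable = "abc";

// The compiler emits an expression tree here instead of a delegate.
Expression<Func<string, bool>> predicate = s => s == my_variable;

// The body is a BinaryExpression: s == <captured my_variable>.
var comparison = (BinaryExpression)predicate.Body;

// Left side: the lambda parameter, kept purely symbolic.
var left = (ParameterExpression)comparison.Left;

// Right side: a field access on the compiler-generated closure object.
var right = (MemberExpression)comparison.Right;
var closure = (ConstantExpression)right.Expression; // holds the live closure instance
var field = (FieldInfo)right.Member;                // the hoisted local

// A consumer of the tree can read the current value whenever it chooses.
var valueNow = field.GetValue(closure.Value);

// Changing the local afterwards is visible through the same tree.
my_variable = "xyz";
var valueLater = field.GetValue(closure.Value);

Console.WriteLine($"{comparison.NodeType}: {left.Name} vs {valueNow}, then {valueLater}");
```

Nothing is evaluated when the tree is built: `s` stays a `ParameterExpression`, and the local is reached through a `MemberExpression` over a `ConstantExpression` that references the closure, so the tree's consumer decides when (or whether) to read it.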

PhillipH · Answer 1 · 2017-05-25T20:13:01.423

Just by examination it's clear that .SomeProp depends on x, and x is not known at expression tree parsing time, because it's an output of a prior function (where).

my_variable is not an output of any expression in the expression tree, so its value might be known at expression tree parsing time. Even so, it's quite likely not to be 'baked in', because that would prevent the compiled expression tree from being reused; it will just be treated as an input value to the expression tree evaluation.

I haven't decompiled LINQ, but you could consider the following expression tree:

ExpressionTree myEx = new ExpressionTree(
   new MultiplicationExpression(
      new InputVariableExpression("@MyInputVar"),
      new ConstantExpression(22)
   )
);

To evaluate it, you might call:

Dictionary<string, object> inputVars = new Dictionary<string, object>();
inputVars.Add("@MyInputVar",16);
int result = myEx.Evaluate(inputVars);

The parser might choose to bake in the constant expression, because it 'knows' it cannot change, but consider the following:

ExpressionTree anotherEx = new ExpressionTree(
   new AdditionExpression(
      myEx,
      new InputVariableExpression("@MyNextInputVar")
   )
);

This is similar in concept to using a replacement variable inside a LINQ lambda (x => ...), where myEx is a stored expression tree, not the result of evaluating that expression. The expression parser cannot independently know what the value of myEx is until execution time.

Dictionary<string, object> inputVars = new Dictionary<string, object>();
inputVars.Add("@MyInputVar",16);
inputVars.Add("@MyNextInputVar",45);
int result = anotherEx.Evaluate(inputVars);

So this execution code will inherently evaluate myEx during the evaluation of anotherEx. If myEx contained only ConstantExpression elements, it might be evaluated only once and the result cached, but because it contains an out-of-scope InputVariableExpression it's clear the result of one evaluation cannot be cached for subsequent use.
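
The classes above (ExpressionTree, InputVariableExpression and so on) are hypothetical. As a rough counterpart using the real System.Linq.Expressions API, the same two trees can be sketched as follows, assuming C# 9+ top-level statements. Here the input variables become ParameterExpression nodes rather than dictionary entries, which is a substitution made for this sketch, not the design described in the answer.

```csharp
using System;
using System.Linq.Expressions;

// Inputs are ParameterExpression nodes; literals are ConstantExpression nodes.
ParameterExpression myInputVar = Expression.Parameter(typeof(int), "myInputVar");
ParameterExpression myNextInputVar = Expression.Parameter(typeof(int), "myNextInputVar");

// myEx: myInputVar * 22
BinaryExpression myEx = Expression.Multiply(myInputVar, Expression.Constant(22));

// anotherEx: (myInputVar * 22) + myNextInputVar, reusing myEx as a subtree.
BinaryExpression anotherEx = Expression.Add(myEx, myNextInputVar);

// Compile once; the resulting delegate can be reused with different inputs.
Func<int, int, int> evaluate =
    Expression.Lambda<Func<int, int, int>>(anotherEx, myInputVar, myNextInputVar)
              .Compile();

int result = evaluate(16, 45); // 16 * 22 + 45
Console.WriteLine(result);     // prints 397
```

Because myEx is embedded as a subtree rather than as a precomputed number, it is re-evaluated on every call to `evaluate`, which matches the point about not caching a result that depends on an input.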

This is the closest answer to my question. Are you suggesting LINQ has some kind of hard-coded evaluation rules for what it evaluates and quotes the result, and what it quotes without evaluating? — David Jeske, May 25 '17 at 19:12
I've never decompiled LINQ, but I have written my own expression tree parsing and compiling code; and yes - I examined the inputs and if they were constant I baked them in, but if they were possibly variable I didn't. This allowed me to reuse the compiled expression later and skip the compilation/parsing step. I'll edit my response to put a more detailed explanation in. — PhillipH, May 25 '17 at 20:03
I wonder what the downvote was for ? Can't please everyone I guess :0) — PhillipH, May 25 '17 at 20:14
It turns out that the "magic" is that Expression trees are not just passive ASTs, they are also linked to the running program state... So you can either look at the textual AST info like the variable names, or you can evaluate the expressions as they relate to the live running program. — David Jeske, May 28 '17 at 03:36

score -1 · Answer 2 · answered May 21 '17 at 07:59

This is fairly easy to test:

int i = 1;
Func<int> func = () => i;
i = 2;
Console.WriteLine(func.Invoke());

That prints out 2, which tells us that it stores the symbol and does not evaluate it until the function is evaluated.
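
The same test can be repeated with an expression tree instead of a delegate. This is a minimal sketch assuming C# 9+ top-level statements; the names `expr`, `viaCompile` and `isMemberAccess` are illustrative.

```csharp
using System;
using System.Linq.Expressions;

int i = 1;
Expression<Func<int>> expr = () => i; // an expression tree, not a delegate
i = 2;

// The tree references the closure object, so compiling after the
// assignment still observes the updated value.
int viaCompile = expr.Compile().Invoke();

// The body is a field access on the closure, not a constant holding 1.
bool isMemberAccess = expr.Body is MemberExpression;

Console.WriteLine($"{viaCompile} {isMemberAccess}"); // prints "2 True"
```

So the tree records where to find `i`, not the value it had when the lambda was written.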

answered May 21 '17 at 07:59 by Unlocked

As a note, here's a nice place to test out little code fragments like this: https://csharppad.com/ – Unlocked May 21 '17 at 08:00
That doesn't answer his question of `how does LINQ capture the value of my_variable` – Mardoxx May 21 '17 at 09:26
Interesting! How does Invoke get the value of I? By crawling up the stack using reflection? That wouldn't work if you returned the func and later evaluated. Does it force the local "int i=1" to be boxed? – David Jeske May 21 '17 at 17:10
It appears to keep a pointer to `i` even when `i` falls out of scope. It's less of what Invoke does and more of what func is. Basically everything in C# is a pointer (citation needed), so the function doesn't need to know the value of `i`, just where to look to find it. As far as I know, LINQ uses the same function system as the rest of C#. Input predicates to LINQ functions do not need to be anonymous, they just usually are because it's faster to write it that way. – Unlocked May 22 '17 at 05:26
