7

Can someone explain to me why the EF Engine is failing in the following scenario?

It works fine with the following expression:

var data = context.Programs
    .Select(d => new MyDataDto
    {
        ProgramId = d.ProgramId,
        ProgramName = d.ProgramName,
        ClientId = d.ClientId,
        Protocols = d.Protocols.Where(p => p.UserProtocols.Any(u => u.UserId == userId))
                .Count(pr => pr.Programs.Any(pg => pg.ProgramId == d.ProgramId))
    })
    .ToList();

But if I encapsulate some into an extension method:

public static IQueryable<Protocol> ForUser(this IQueryable<Protocol> protocols, int userId)
{
    return protocols.Where(p => p.UserProtocols.Any(u => u.UserId == userId));
}

The resulting query:

var data = context.Programs
    .Select(d => new MyDataDto
    {
        ProgramId = d.ProgramId,
        ProgramName = d.ProgramName,
        ClientId = d.ClientId,
        Protocols = d.Protocols.ForUser(userId)
                .Count(pr => pr.Programs.Any(pg => pg.ProgramId == d.ProgramId))
    })
    .ToList();

Fails with the exception: LINQ to Entities does not recognize the method 'System.Linq.IQueryable1[DAL.Protocol] ForUser(System.Linq.IQueryable1[DAL.Protocol], Int32)' method, and this method cannot be translated into a store expression.

I would expect the EF Engine to build the entire expression tree, chaining the necessary expressions and then generate the SQL. Why doesn't it do that?

Jonesopolis
  • 25,034
  • 12
  • 68
  • 112
MihaiP.
  • 133
  • 2
  • 10
  • See http://stackoverflow.com/a/36736907/1625737 and http://stackoverflow.com/a/29960775/1625737, not sure it would work with extension-methods though – haim770 Sep 21 '16 at 18:20
  • Thank you for the tip. I've started looking down that road but it will take me a while to digest all that about expressions and functions. Meanwhile, is there a way to re-write my extension method like an Expression> (T probably being Protocol here) so it will work with my intended usage? – MihaiP. Sep 21 '16 at 18:31
  • I'm not sure LinqKit would be able to help here. I think @StriplingWarrior answer is the most sensible approach. – haim770 Sep 21 '16 at 18:36
  • `d.Protocols` seems to be a navigation property, hence cannot be `IQueryable` (most likely `ICollection`), right? – Ivan Stoev Sep 21 '16 at 19:35
  • Does EF6 have support for compiled queries (or equivalent)? LINQ to SQL and older versions of EF would and allow you to create such a method. Otherwise, you might be able to use [linqkit](http://www.albahari.com/nutshell/linqkit.aspx). – Jeff Mercado Sep 22 '16 at 16:56

2 Answers2

9

This is happening because the call to ForUser() is being made inside of the expression tree that the C# compiler builds when it sees the lambda you pass into Select. Entity Framework tries to figure out how to convert that function into SQL, but it can't invoke the function for a few reasons (e.g. d.Protocols does not exist at the moment).

The simplest approach that works for a case like this is to have your helper return a criteria lambda expression, and then pass that into the .Where() method yourself:

public static Expression<Func<Protocol, true>> ProtocolIsForUser(int userId)
{
    return p => p.UserProtocols.Any(u => u.UserId == userId);
}

...

var protocolCriteria = Helpers.ProtocolIsForUser(userId);
var data = context.Programs
    .Select(d => new MyDataDto
    {
        ProgramId = d.ProgramId,
        ProgramName = d.ProgramName,
        ClientId = d.ClientId,
        Protocols = d.Protocols.Count(protocolCriteria)
    })
    .ToList();

More information

When you invoke a LINQ method outside of an expression tree (like you do with context.Programs.Select(...)), the Queryable.Select() extension method actually gets invoked, and its implementation returns an IQueryable<> that represents the extension method getting called on the original IQueryable<>. Here's the implementation of Select, for instance:

    public static IQueryable<TResult> Select<TSource,TResult>(this IQueryable<TSource> source, Expression<Func<TSource, TResult>> selector) {
        if (source == null)
            throw Error.ArgumentNull("source");
        if (selector == null)
            throw Error.ArgumentNull("selector");
        return source.Provider.CreateQuery<TResult>( 
            Expression.Call(
                null,
                GetMethodInfo(Queryable.Select, source, selector),
                new Expression[] { source.Expression, Expression.Quote(selector) }
                ));
    }

When the queryable's Provider has to generate actual data from the IQueryable<>, it analyzes the expression tree and tries to figure out how to interpret those method calls. Entity Framework has built-in knowledge of many LINQ-related functions like .Where() and .Select(), so it knows how to translate those method calls into SQL. However, it doesn't know what to do for methods that you write.

So why does this work?

var data = context.Programs.ForUser(userId);

The answer is that your ForUser method is not implemented like the Select method above: you are not adding an expression to the queryable to represent calling ForUser. Instead, you are returning the result of a .Where() call. From the IQueryable<>'s perspective, it's as if Where() was called directly, and the call to ForUser() never happened.

You can prove this by capturing the Expression property on the IQueryable<>:

Console.WriteLine(data.Expression.ToString());

... which will produce something like this:

Programs.Where(u => (u.UserId == value(Helpers<>c__DisplayClass1_0).userId))

There's no call to ForUser() anywhere in that expression.

On the other hand, if you include the ForUser() call inside of an expression tree like this:

var data = context.Programs.Select(d => d.Protocols.ForUser(id));

... then the .ForUser() method never actually gets invoked, so it never returns an IQueryable<> that knows the .Where() method got called. Instead, the expression tree for the queryable shows .ForUser() getting invoked. Outputting its expression tree would look something like this:

Programs.Select(d => d.Protocols.ForUser(value(Repository<>c__DisplayClass1_0).userId))

Entity Framework has no idea what ForUser() is supposed to do. As far as it's concerned, you could have written ForUser() to do something that's impossible to do in SQL. So it tells you that's not a supported method.

StriplingWarrior
  • 151,543
  • 27
  • 246
  • 315
  • Thank you for your answer. Please let me try to rephrase a bit my question. What I don't understand is why is the engine differentiating between Linq's _Where_ extension method and my _ForUser_ extension method? I would expect it to recursively parse the extension methods chaining, build the complete Expression Tree and then generate the SQL. You are saying that d.Protocols does not exist yet. But can't it be inferred like "all Protocols for my Program d" which would be translated in a subquery? – MihaiP. Sep 21 '16 at 19:50
  • Take a look at (google) tree-expression builder. Just as StriplingWarrior mentiond the builder says it wil evaluate it whyile executing. If i remember correctly, EF 7 has the tree builder a bit different and evaluate it before it executes it. – Roelant M Sep 21 '16 at 20:03
  • @MihaiP.: I added a much more detailed explanation. Let me know if that makes sense. – StriplingWarrior Sep 21 '16 at 20:37
  • It becomes even more clear if you convert the extension methods to static method calls: `var data = MyExtensions.ForUser(context.Programs, userId);` works because the statement starts with the `ForUser` call and it simply returns an `IQueryable`. In the other cases, `MyExtensions.ForUser(...)` is an "alien" CLR method call inside an expression. – Gert Arnold Sep 21 '16 at 20:48
  • @StriplingWarrior: I really appreciate the time you took to provide such a detail explanation. In the end we can't tell why exactly the EF Team has decided for this implementation and far from me to assert I know better than them. One reason could be efficiency of the resulted query as the sub-queries might not be properly optimized by different database engines. I'll post below a different way of writing that query which satisfies in the end my requirement: use those extension methods. – MihaiP. Sep 22 '16 at 11:55
  • 1
    @MihaiP. At the end it's crystal clear that what are you asking for is impossible to be implemented. The EF team decision has nothing to do with efficiency and subqueries. Read carefully StriplingWarrior answer and think logically. The key point is that `Queryable` extension methods are **not executed** (same for `DbFunctions` and `SqlFunctions`). They are just known signatures like contract (interface) - name and arguments. EF is actually implementing them in SQL. But of course no one knows your methods, neither can decompile what they do, so obviously they cannot be translated. – Ivan Stoev Sep 22 '16 at 15:06
  • @MihaiP.: Suppose for a moment that you put an `if` statement in your `ForUser` extension method, which checked to see whether the `protocols` query had any items in it, and returned a different value (or threw an exception) if it didn't. In order for `ForUser` to function at that point, the `protocols` query would need to be backed by a real query that's capable of being run. That cannot happen in a SQL statement, and yet for any given `Program` you don't know what its `Protocols` are until you're running the query on the database. – StriplingWarrior Sep 22 '16 at 15:39
0

As I mentioned in my comment above, I can't tell why the EF Engine is working the way it is. Therefore, I've tried to find a way to re-write the query so I'll be able to make use of my extension methods.

The tables are:

Program -> 1..m -> ProgramProtocol -> m..1 -> Protocol

ProgramProtocol is just a join table and is not mapped in the model by Entity Framework. The idea is simple: select "from left", select "from right" and then join the resulted sets for proper filtering:

var data = context.Programs.ForUser(userId)
    .SelectMany(pm => pm.Protocols,
        (pm, pt) => new {pm.ProgramId, pm.ProgramName, pm.ClientId, pt.ProtocolId})
    .Join(context.Protocols.ForUser(userId), pm => pm.ProtocolId,
        pt => pt.ProtocolId, (pm, pt) => pm)
    .GroupBy(pm => new {pm.ProgramId, pm.ProgramName, pm.ClientId})
    .Select(d => new MyDataDto
    {
        ProgramName = d.Key.ProgramName,
        ProgramId = d.Key.ProgramId,
        ClientId = d.Key.ClientId,
        Protocols = d.Count()
    })
    .ToList();
MihaiP.
  • 133
  • 2
  • 10
  • 1
    Who's "we"? I explained in great detail why the EF Engine is working the way it is--why it can't possibly work the way you want it to. If you haven't understood then that's because you haven't taken the time to think through my explanation. Your solution should work, but it seems a lot more complicated than it needs to be, and the only thing it saves you is a call to `.Where()`. – StriplingWarrior Sep 22 '16 at 15:53
  • @StriplingWarrior - I edited my answer to remove any unintended associations with other people. But maybe it was the royal "we"... :) – MihaiP. Sep 24 '16 at 01:05