16

While trying to organize some data access code using EF Core I noticed that the generated queries were worse than before, they now queried columns that were not needed. The basic query is just selecting from one table and mapping a subset of columns to a DTO. But after rewriting it now all columns are fetched, not just the ones in the DTO.

I created a minimal example with some queries that show the problem:

ctx.Items.ToList();
// SELECT i."Id", i."Property1", i."Property2", i."Property3" FROM "Items" AS i


ctx.Items.Select(x => new
{
  Id = x.Id,
  Property1 = x.Property1
}
).ToList();
// SELECT i."Id", i."Property1" FROM "Items" AS i

ctx.Items.Select(x => new MinimalItem
{
  Id = x.Id,
  Property1 = x.Property1
}
).ToList();
// SELECT i."Id", i."Property1" FROM "Items" AS i

ctx.Items.Select(
  x => x.MapToMinimalItem()
).ToList();
// SELECT i."Id", i."Property1", i."Property2", i."Property3" FROM "Items" AS i

ctx.Items.Select(
  x => new MinimalItem(x)
).ToList();

// SELECT i."Id", i."Property1", i."Property2", i."Property3" FROM "Items" AS i

The objects are defined like this:

  public class Item
  {
    public int Id { get; set; }
    public string Property1 { get; set; }
    public string Property2 { get; set; }
    public string Property3 { get; set; }

  }

  public class MinimalItem
  {
    public MinimalItem() { }

    public MinimalItem(Item source)
    {
      Id = source.Id;
      Property1 = source.Property1;
    }
    public int Id { get; set; }
    public string Property1 { get; set; }
  }

  public static class ItemExtensionMethods
  {
    public static MinimalItem MapToMinimalItem(this Item source)
    {
      return new MinimalItem
      {
        Id = source.Id,
        Property1 = source.Property1
      };
    }
  }

The first query queries all columns as intended, and the second query with an anonymous object only queries the selected queries, that works all fine. Using my MinimalItem DTO also works as long as it is created directly in the Select method. But the last two queries fetch all columns even though they do exactly the same thing as the third query, just moved to a constructor or an extension method, respectively.

Obviously EF Core can't follow this code and determine that it only needs the two columns if I move it out of the Select method. But I'd really like to do that to be able to reuse the mapping code, and make the actual query code easier to read. How can I extract this kind of straightforward mapping code without making EF Core inefficiently fetching all columns all the time?

Mad Scientist
  • 18,090
  • 12
  • 83
  • 109
  • You queries are returning the entire class ITEM. – jdweng May 31 '20 at 11:45
  • @jdweng the last two return a MinimalItem in the Select. Just like the third one that works. This is the signature VS Code shows me for the `toList()` in the last two queries: `List IEnumerable.ToList()` – Mad Scientist May 31 '20 at 11:48
  • If you want only some of the columns then you need a select to specify which columns to return. You did not have a select so ALL columns are being returned. Look at following which return some of the columns : SELECT i."Id", i."Property1" – jdweng May 31 '20 at 11:54
  • (1) What EF Core version are you using? (2) Are you allowed to use 3rd party open source packages? – Ivan Stoev Jun 01 '20 at 09:29
  • @IvanStoev The most recent .NET Core 3.1. I would prefer to avoid using a third-party package if possible for this kind of core concern, especially something more complex and "magical" like Automapper. But I'm not categorically against using a package for this. – Mad Scientist Jun 01 '20 at 11:40
  • @MadScientist I wouldn't ask you if there was out of the box solution. The package I had in mind (see my answer below) really does some magic - probably a good candidate to be embedded in EF Core. – Ivan Stoev Jun 01 '20 at 18:05
  • There is a Github issue suggesting to disable this behavior: https://github.com/dotnet/efcore/issues/24509 – sveinungf May 26 '22 at 07:50

2 Answers2

16

This is fundamental problem with IQueryable from the very beginning, with no out of the box solution after so many years.

The problem is that IQueryable translation and code encapsulation/reusability are mutually exclusive. IQueryable translation is based on knowledge in advance, which means the query processor must be able to "see" the actual code, and then translate the "known" methods/properties. But the content of the custom methods / calculable properties is not visible at runtime, so query processors usually fail, or in limited cases where they support "client evaluation" (EF Core does that only for final projections) they generate inefficient translation which retrieves much more data than needed like in your examples.

To recap, neither C# compiler nor BCL helps solving this "core concern". Some 3rd party libraries are trying to address it in different level of degree - LinqKit, NeinLinq and similar. The problem with them is that they require refactoring your existing code additionally to calling a special method like AsExpandable(), ToInjectable() etc.

Recently I found a little gem called DelegateDecompiler, which uses another package called Mono.Reflection.Core to decompile method body to its lambda representation.

Using it is quite easy. All you need after installing it is to mark your custom methods / computed properties with custom provided [Computed] or [Decompile] attributes (just make sure you use expression style implementation and not code blocks), and call Decompile() or DecompileAsync() custom extension method somewhere in the IQueryable chain. It doesn't work with constructors, but all other constructs are supported.

For instance, taking your extension method example:

public static class ItemExtensionMethods
{
    [Decompile] // <--
    public static MinimalItem MapToMinimalItem(this Item source)
    {
        return new MinimalItem
        {
            Id = source.Id,
            Property1 = source.Property1
        };
    }
}

(Note: it supports other ways of telling which methods to decompile, for instance all methods/properties of specific class etc.)

and now

ctx.Items.Decompile()
    .Select(x => x.MapToMinimalItem())
    .ToList();

produces

// SELECT i."Id", i."Property1" FROM "Items" AS i

The only problem with this approach (and other 3rd party libraries) is the need of calling custom extension method Decompile, in order to wrap the queryable with custom provider just to be able to preprocess the final query expression.

It would have been nice if EF Core allow plugging custom query expression preprocessor in its LINQ query processing pipeline, thus eliminating the need of calling custom method in each query, which could easily be forgotten, and also custom query providers does not play well with EF Core specific extensions like AsTracking, AsNoTracking, Include/ ThenInclude, so it should really be called after them etc.


Update (EF Core 7.0+):

EF Core 7.0 finally added Interception to modify the LINQ expression tree capability, so now the plumbing code is reduced to

using Microsoft.EntityFrameworkCore.Query;
using Microsoft.EntityFrameworkCore.Diagnostics;

namespace Microsoft.EntityFrameworkCore
{
    public static class DelegateDecompilerDbContextOptionsBuilderExtensions
    {
        public static DbContextOptionsBuilder AddDelegateDecompiler(this DbContextOptionsBuilder optionsBuilder)
            => optionsBuilder.AddInterceptors(new DelegateDecompilerQueryPreprocessor());
    }
}

namespace Microsoft.EntityFrameworkCore.Query
{
    using System.Linq.Expressions;
    using DelegateDecompiler;
    public class DelegateDecompilerQueryPreprocessor : IQueryExpressionInterceptor
    {
        Expression IQueryExpressionInterceptor.QueryCompilationStarting(Expression queryExpression, QueryExpressionEventData eventData)
            => DecompileExpressionVisitor.Decompile(queryExpression);
    }
}

Original:

Currently there is an open issue Please open the query translation pipeline for extension #19748 where I'm trying to convince the team to add an easy way to add expression preprocessor. You can read the discussion and vote up.

Until then, here is my solution for EF Core 3.1:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using Microsoft.EntityFrameworkCore.Infrastructure;
using Microsoft.EntityFrameworkCore.Query;
using Microsoft.Extensions.DependencyInjection;

namespace Microsoft.EntityFrameworkCore
{
    public static partial class CustomDbContextOptionsExtensions
    {
        public static DbContextOptionsBuilder AddQueryPreprocessor(this DbContextOptionsBuilder optionsBuilder, IQueryPreprocessor processor)
        {
            var option = optionsBuilder.Options.FindExtension<CustomOptionsExtension>()?.Clone() ?? new CustomOptionsExtension();
            if (option.Processors.Count == 0)
                optionsBuilder.ReplaceService<IQueryTranslationPreprocessorFactory, CustomQueryTranslationPreprocessorFactory>();
            else
                option.Processors.Remove(processor);
            option.Processors.Add(processor);
            ((IDbContextOptionsBuilderInfrastructure)optionsBuilder).AddOrUpdateExtension(option);
            return optionsBuilder;
        }
    }
}

namespace Microsoft.EntityFrameworkCore.Infrastructure
{
    public class CustomOptionsExtension : IDbContextOptionsExtension
    {
        public CustomOptionsExtension() { }
        private CustomOptionsExtension(CustomOptionsExtension copyFrom) => Processors = copyFrom.Processors.ToList();
        public CustomOptionsExtension Clone() => new CustomOptionsExtension(this);
        public List<IQueryPreprocessor> Processors { get; } = new List<IQueryPreprocessor>();
        ExtensionInfo info;
        public DbContextOptionsExtensionInfo Info => info ?? (info = new ExtensionInfo(this));
        public void Validate(IDbContextOptions options) { }
        public void ApplyServices(IServiceCollection services)
            => services.AddSingleton<IEnumerable<IQueryPreprocessor>>(Processors);
        private sealed class ExtensionInfo : DbContextOptionsExtensionInfo
        {
            public ExtensionInfo(CustomOptionsExtension extension) : base(extension) { }
            new private CustomOptionsExtension Extension => (CustomOptionsExtension)base.Extension;
            public override bool IsDatabaseProvider => false;
            public override string LogFragment => string.Empty;
            public override void PopulateDebugInfo(IDictionary<string, string> debugInfo) { }
            public override long GetServiceProviderHashCode() => Extension.Processors.Count;
        }
    }
}

namespace Microsoft.EntityFrameworkCore.Query
{
    public interface IQueryPreprocessor
    {
        Expression Process(Expression query);
    }

    public class CustomQueryTranslationPreprocessor : RelationalQueryTranslationPreprocessor
    {
        public CustomQueryTranslationPreprocessor(QueryTranslationPreprocessorDependencies dependencies, RelationalQueryTranslationPreprocessorDependencies relationalDependencies, IEnumerable<IQueryPreprocessor> processors, QueryCompilationContext queryCompilationContext)
            : base(dependencies, relationalDependencies, queryCompilationContext) => Processors = processors;
        protected IEnumerable<IQueryPreprocessor> Processors { get; }
        public override Expression Process(Expression query)
        {
            foreach (var processor in Processors)
                query = processor.Process(query);
            return base.Process(query);
        }
    }

    public class CustomQueryTranslationPreprocessorFactory : IQueryTranslationPreprocessorFactory
    {
        public CustomQueryTranslationPreprocessorFactory(QueryTranslationPreprocessorDependencies dependencies, RelationalQueryTranslationPreprocessorDependencies relationalDependencies, IEnumerable<IQueryPreprocessor> processors)
        {
            Dependencies = dependencies;
            RelationalDependencies = relationalDependencies;
            Processors = processors;
        }
        protected QueryTranslationPreprocessorDependencies Dependencies { get; }
        protected RelationalQueryTranslationPreprocessorDependencies RelationalDependencies { get; }
        protected IEnumerable<IQueryPreprocessor> Processors { get; }
        public QueryTranslationPreprocessor Create(QueryCompilationContext queryCompilationContext)
            => new CustomQueryTranslationPreprocessor(Dependencies, RelationalDependencies, Processors, queryCompilationContext);
    }
}

You don't need to understand that code. Most (if not all) of it is a boilerplate plumbing code to support the currently missing IQueryPreprocessor and AddQueryPreprocesor (similar to recently added interceptors). I'll update it if EF Core adds that functionality in the future.

Now you can use it to plug the DelegateDecompiler into EF Core:

using System.Linq.Expressions;
using Microsoft.EntityFrameworkCore.Query;
using DelegateDecompiler;

namespace Microsoft.EntityFrameworkCore
{
    public static class DelegateDecompilerDbContextOptionsExtensions
    {
        public static DbContextOptionsBuilder AddDelegateDecompiler(this DbContextOptionsBuilder optionsBuilder)
            => optionsBuilder.AddQueryPreprocessor(new DelegateDecompilerQueryPreprocessor());
    }
}

namespace Microsoft.EntityFrameworkCore.Query
{
    public class DelegateDecompilerQueryPreprocessor : IQueryPreprocessor
    {
        public Expression Process(Expression query) => DecompileExpressionVisitor.Decompile(query);
    }
}

A lot of code just to be able to call

DecompileExpressionVisitor.Decompile(query)

before EF Core processing, but it is what it is.


Now all you need is to call

optionsBuilder.AddDelegateDecompiler();

in your derived context OnConfiguring override, and all your EF Core LINQ queries will be preprocessed and decompiled bodies injected.

With you examples

ctx.Items.Select(x => x.MapToMinimalItem())

will automatically be converted to

ctx.Items.Select(x => new
{
    Id = x.Id,
    Property1 = x.Property1
}

thus translated by EF Core to

// SELECT i."Id", i."Property1" FROM "Items" AS I

which was the goal.

Additionally, composing over projection also works, so the following query

ctx.Items
    .Select(x => x.MapToMinimalItem())
    .Where(x => x.Property1 == "abc")
    .ToList();

originally would have generated runtime exception, but now translates and runs successfully.

Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
  • Couldn't the time it takes to decompile lead to performance problems? – MrDave1999 Jul 27 '22 at 02:49
  • @MrDave1999 Sure it adds some time to the query processing pipeline. But I don't think it is eligible for causing performance issues since the most of the time is in db query execution and reading/transforming the result. – Ivan Stoev Jul 27 '22 at 05:56
  • In EF Core 7.0, it throws the following exception: `System.TypeLoadException` and the problem lies in the signature of the `GetServiceProviderHashCode` method (see this [issue](https://github.com/MrDave1999/EFCore.CustomQueryPreprocessor/issues/1)). Some time ago I created a package with your solution and published a package to solve this problem (see [release](https://github.com/MrDave1999/EFCore.CustomQueryPreprocessor/releases/tag/2.0.0)). – MrDave1999 Dec 09 '22 at 01:13
  • @MrDave1999 Nice, thanks! However, looks like in EF Core 7.0 they added [Interception to modify the LINQ expression tree #28505](https://github.com/dotnet/efcore/issues/28505) capability "out of the box", so probably most of this custom code is not needed anymore. Haven't time to check it out, will update the answer when got some free time. – Ivan Stoev Dec 09 '22 at 08:48
  • 1
    @MrDave1999 Answer updated. As expected, the custom code needed is much simpler. – Ivan Stoev Dec 11 '22 at 06:41
  • Note for newcomers like me. The EF Core 7 solution just automatically adds Decompile()/DecompileAsync() to the queries. [Decompile] must be still placed in the mapping/projection function – user3625699 Jan 27 '23 at 23:00
3

Entity Framework does not know anything about your MapToMinimalItem method and how to translate it into SQL, so it fetches whole entity and performs the Select on the client side.

If you take a closer look at the EF LINQ method signatures, you will see, that IQueryable operates with Expression's of Func (Select for example) instead of Funcs as it's IEnumerable counterpart, so underlying provider could analyze the code and generate what is needed (SQL in this case).

So if you want to move the projection code into separate method this method should return Expression, so EF could transform it into SQL. For example:

public static class ItemExtensionMethods
{
    public static readonly Expression<Func<Item, MinimalItem>> MapToMinimalItemExpr = 
        source => new MinimalItem
        {
            Id = source.Id,
            Property1 = source.Property1
        };
}

though it will have limited usability caused you will not able to reuse it nested projections, only in simple like this:

ctx.Items.Select(ItemExtensionMethods.MapToMinimalItemExpr)
Guru Stron
  • 102,774
  • 10
  • 95
  • 132
  • Thanks for pointing me in the right direction on what is required under the hood for this to work. I played around a bit with this, and it felt more natural to put the entire Select into an extension method and avoid handling Expressions directly. I can't reuse the mapping code for cases outside LINQ in neither case anyway. – Mad Scientist Jun 01 '20 at 11:43
  • @MadScientist 1) actually you can, just call `Compile` on expression and save it somewhere: `class ItemExtensionMethods { .... MapToMinimalItem = ItemExtensionMethods.MapToMinimalItemExpr.Compile();}` 2) I don't like it but [`Automapper`](https://docs.automapper.org/en/stable/Queryable-Extensions.html) has an option to reuse it's mapping in select clauses. – Guru Stron Jun 01 '20 at 11:56
  • @GuruStron can be used in nested projections by forcing `IQueryable` using `AsQueryable` before the `Select` – joakimriedel Dec 14 '20 at 08:40