4

I have a table called Products and I need to find the products with unique title for a particular category. Earlier we used to do with this query in entity framework core 2.2 :

currentContext.Products
              .GroupBy(x => x.Title)
              .Select(x => x.FirstOrDefault()))
              .Select(x => new ProductViewModel
                 {
                     Id = x.Id,
                     Title = x.Title,
                     CategoryId= x.CategoryId
                 }).ToList();

But after upgrading to Entity Framework Core 5.0, we get an error for Groupby Shaker exception:

The LINQ expression 'GroupByShaperExpression:KeySelector: t.title, ElementSelector:EntityShaperExpression: EntityType: Project ValueBufferExpression: ProjectionBindingExpression: EmptyProjectionMember IsNullable: False .FirstOrDefault()' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'.

I know there are multiple way to client projection but I am searching for most efficient way to search.

Jalpesh Vadgama
  • 13,653
  • 19
  • 72
  • 94
  • 1
    I assume upgrading from 2.x to 5.0. Because this query is not valid for LINQ to Entities. – Svyatoslav Danyliv Mar 17 '21 at 07:14
  • 2
    The exception message is telling you that `FirstOrDefault` over `GroupBy` result is not supported. So either take the client evaluation switch advice (since your query was really client evaluated in pre EFC 3.0), or seek SO for how to rewrite the query - there are a lot of similar questions since it is a common problem. – Ivan Stoev Mar 17 '21 at 07:39
  • Yes it is a problem from migrating from 2.2 to 5.0 – Jalpesh Vadgama Mar 17 '21 at 08:28
  • Maybe an underlying problem is that `Title` isn't unique. That doesn't seem a heathy condition for a `Product` table. – Gert Arnold Mar 17 '21 at 10:31
  • There are other aspects of table like the color of product etc. But that does not matter for this query. And it is working fine in 2.2 one but in entity framework 5.0 I want to see the best way to find that. – Jalpesh Vadgama Mar 19 '21 at 05:12
  • 2
    (1) The problem is NOT the upgrading. Your query haven't been translated to SQL in 2.x. It was "working" by silently evaluating it client side. The equivalent of inserting `AsEnumerable()` before `GroupBy`. if you didn't care before, why do care now? (2) As I wrote in the previous comment, the so called "Top N items per group" is very common, but not handled by EFC, so has been asked and answered many times, which you would have found if searched SO as suggested. I can easily adjust for your case and drop one of my own answers, but that doesn't make sense - the question is simply dupe. – Ivan Stoev Mar 24 '21 at 20:20
  • Here is the exact open issue [Translate GroupBy followed by FirstOrDefault over group #12088](https://github.com/dotnet/efcore/issues/12088). As you can see, it is scheduled for 6.0, so wait till mid Nov and you'll be fine. – Ivan Stoev Mar 24 '21 at 20:30
  • @JalpeshVadgama I have posted two possible workarounds, depending on what you want to do. Let me know if you give them a try. – dglozano Mar 29 '21 at 06:47

5 Answers5

3

Most likely that LINQ query couldn't be translated in EF Core 2.2 either, because of some limitations that the GroupBy operator has.

From the docs:

Since no database structure can represent an IGrouping, GroupBy operators have no translation in most cases. When an aggregate operator is applied to each group, which returns a scalar, it can be translated to SQL GROUP BY in relational databases. The SQL GROUP BY is restrictive too. It requires you to group only by scalar values. The projection can only contain grouping key columns or any aggregate applied over a column.

What happened in EF Core 2.x is that whenever it couldn't translate an expression, it would automatically switch to client evaluation and give just a warning.

This is listed as the breaking change with highest impact when migrating to EF Core >= 3.x :

Old behavior

Before 3.0, when EF Core couldn't convert an expression that was part of a query to either SQL or a parameter, it automatically evaluated the expression on the client. By default, client evaluation of potentially expensive expressions only triggered a warning.

New behavior

Starting with 3.0, EF Core only allows expressions in the top-level projection (the last Select() call in the query) to be evaluated on the client. When expressions in any other part of the query can't be converted to either SQL or a parameter, an exception is thrown.

So if the performance of that expression was good enough when using EF Core 2.x, it will be as good as before if you decide to explicitly switch to client evaluation when using EF Core 5.x. That's because both are client evaluated, before and now, with the only difference being that you have to be explicit about it now. So the easy way out, if the performance was acceptable previously, would be to just client evaluate the last part of the query using .AsEnumerable() or .ToList().

If client evaluation performance is not acceptable (which will imply that it wasn't before the migration either) then you have to rewrite the query. There are a couple of answers by Ivan Stoev that might get you inspired.

I am a little confused by the description of what you want to achieve: I need to find the products with unique title for a particular category and the code you posted, since I believe it's not doing what you explained. In any case, I will provide possible solutions for both interpretations.

This is my attempt of writing a query to find the products with unique title for a particular category.

var uniqueProductTitlesForCategoryQueryable = currentContext.Products
              .Where(x => x.CategoryId == categoryId)
              .GroupBy(x => x.Title)
              .Where(x => x.Count() == 1)
              .Select(x => x.Key); // Key being the title

var productsWithUniqueTitleForCategory = currentContext.Products
              .Where(x => x.CategoryId == categoryId)
              .Where(x => uniqueProductTitlesForCategoryQueryable .Contains(x.Title))
              .Select(x => new ProductViewModel
                 {
                     Id = x.Id,
                     Title = x.Title,
                     CategoryId= x.CategoryId
                 }).ToList();

And this is my attempt of rewriting the query you posted:

currentContext.Products
              .Select(product => product.Title)
              .Distinct()
              .SelectMany(uniqueTitle => currentContext.Products.Where(product => product.Title == uniqueTitle ).Take(1))
              .Select(product => new ProductViewModel
                 {
                     Id = product.Id,
                     Title = product.Title,
                     CategoryId= product.CategoryId
                 })
              .ToList();

I am getting the distinct titles in the Product table and per each distinct title I get the first Product that matches it (that should be equivalent as GroupBy(x => x.Title)+ FirstOrDefault AFAIK). You could add some sorting before the Take(1) if needed.

dglozano
  • 6,369
  • 2
  • 19
  • 38
  • At least you show that a very ugly work-around is all we have at the moment, but the work-around itself isn't correct. Selecting groups with 1 item is not the same as selecting the first item of a group. – Gert Arnold Mar 28 '21 at 17:40
  • @GertArnold that's true, but I am a little confused by the description he provided of what he wants to achieve and the code he posted. My intention was to provide code for what he explained as `I need to find products with unique titles for a specific category` (but I might still be wrong...). Could rewrite the code if OP clarifies – dglozano Mar 28 '21 at 17:46
  • @GertArnold have updated my answer to cover, I believe, both possible interpretations – dglozano Mar 28 '21 at 19:00
0

You can use Join for this query as below :

currentContext.Products
                .GroupBy(x => x.Title)
                .Select(x => new ProductViewModel() 
                { 
                    Title = x.Key,
                    Id = x.Min(b => b.Id) 
                })
                .Join(currentContext.Products, a => a.Id, b => b.Id, 
                     (a, b) => new ProductViewModel()
                {
                    Id = a.Id,
                    Title = a.Title,
                    CategoryId = b.CategoryId
                }).ToList(); 

If you watch or log translated SQL query, it would be as below:

SELECT [t].[Title], [t].[c] AS [Id], [p0].[CategoryId] AS [CategoryId]
FROM (
    SELECT [p].[Title], MIN([p].[Id]) AS [c]
    FROM [Product].[Products] AS [p]
    GROUP BY [p].[Title]
) AS [t]
INNER JOIN [Product].[Products] AS [p0] ON [t].[c] = [p0].[Id]

As you can see, the entire query is translated into one SQL query and it is highly efficient because GroupBy operation is being performed in database and no additional record is fetched by the client.

Navid Rsh
  • 308
  • 1
  • 6
  • 14
  • 1
    I was thinking of a similar solution, but I think it's hardly a work-around. OP tries to do `x.FirstOrDefault()`, but that only makes sense when there is some ordering. Using `Min` (or `Max`) `Id` is not a sensible ordering IMO. It's a very common requirement to get the most recent item in a group (by some date). Can't be done now without jumping through hoops, or not al all, it's crazy. – Gert Arnold Mar 28 '21 at 18:40
  • In most cases `Id` is an increment, so the first occurrence has the lowest Id. But in other scenarios, you are right, the second part of the query needs some modifications to get the first occurrence. maybe it can be achieved by another `GroupBy` and `Take(1)`. – Navid Rsh Mar 28 '21 at 20:11
0

As mentioned by Ivan Stoev, EFC 2.x just silently loads full table to the client side and then apply needed logic for extracting needed result. It is resource consuming way and thanks that EFC team uncovered such potential harmful queries.

Most effective way is already known - raw SQL and window functions. SO is full of answers like this.

SELECT 
   s.Id,
   s.Title,
   s.CategoryId
FROM 
  (SELECT 
     ROW_NUMBER() OVER (PARTITION BY p.Title ORDER BY p.Id) AS RN,
     p.*
  FROM Products p) s
WHERE s.RN = 1

Not sure that EFC team will invent universal algorithm for generating such SQL in nearest future, but for special edge cases it is doable and maybe it is their plan to do that for EFC 6.0

Anyway if performance and LINQ is priority for such question, I suggest to try our adaptation of linq2db ORM for EF Core projects: linq2db.EntityFrameworkCore

And you can get desired result without leaving LINQ:

urrentContext.Products
    .Select(x =>  new 
    { 
        Product = x,
        RN = Sql.Ext.RowNumber().Over()
            .PartitionBy(x.Title)
            .OrderBy(x.Id)
            .ToValue()
    })
    .Where(x => x.RN == 1)
    .Select(x => x.Product)
    .Select(x => new ProductViewModel
        {
            Id = x.Id,
            Title = x.Title,
            CategoryId = x.CategoryId
        })
    .ToLinqToDB()
    .ToList();
Svyatoslav Danyliv
  • 21,911
  • 3
  • 16
  • 32
-2

Short answer is you deal with breaking changes in EF Core versions.

You should consider the total API and behavior changes for migration from 2.2 to 5.0 as I provided bellow:

You may face other problems to write valid expressions using the newer version. In my opinion, upgrading to a newer version is not important itself. This is important to know how to work with a specific version.

Amirhossein Mehrvarzi
  • 18,024
  • 7
  • 45
  • 70
-3

You should use .GroupBy() AFTER materialization. Unfortunately, EF core doesn't support GROUP BY. In version 3 they introduced strict queries which means you can not execute IQeuriables that can't be converted to SQL unless you disable this configuration (which is not recommended). Also, I'm not sure what are you trying to get with GroupBy() and how it will influence your final result. Anyway, I suggest you upgrade your query like this:

currentContext.Products
          .Select(x=> new {
             x.Id,
             x.Title,
             x.Category
          })
          .ToList()
          .GroupBy(x=> x.Title)
          .Select(x => new Wrapper
             { 
                 ProductsTitle = x.Key,
                 Products = x.Select(p=> new ProductViewModel{
                       Id = p.Id,
                       Title = p.Title,
                       CategoryId= p.CategoryId
                 }).ToList()
             }).ToList();
dantey89
  • 2,167
  • 24
  • 37