3

I have this Linq query:

DataBase
    .GroupBy(data => new { data.X, data.Y }, (key, group) => new
    {
        key.X,
        key.Y,
        Somethings = group.Sum(x => x.Something),

        [Code shortened for brevity]

    })
    .OrderBy(row => row.X)
    .ThenBy(row => row.Y)

When using Linq-to-SQL it translates beautifully into the SQL query I was expecting, something like:

SELECT [t1].[x] AS [X], [t1].[y] AS [Y], [t1].[value] AS [Somethings], [...]
FROM (
    SELECT SUM([t0].[something]) AS [value], [...], [t0].[x], [t0].[y]
    FROM [table] AS [t0]
    GROUP BY [t0].[x], [t0].[y]
    ) AS [t1]
ORDER BY [t1].[x], [t1].[y]

The point above being the SUM performed in the database. However, with Linq-to -Entities in Entity Framework Core (2.0), I get something like the following:

SELECT [data0].[x], [data0].[y], [data0].[something], [...]
FROM [table] AS [data0]
ORDER BY [data0].[x], [data0].[y]

The point this time being no summarizing the data at database level (but rather in memory after all the data has been fetched). It should come as no surprise that this takes a lot longer.

I've searched and found similar questions answered, but nothing that I get in terms of how it translates to my specific query. I have two questions:

  1. How can I produce the optimized result (summarizing at database level) with Linq-to-Entities?
  2. As a bonus, if you know to explain some sort of reasoning behind this all that makes the behavior predictable?

(I know I can execute specific SQL using DbCommand or something, but that's not the answer I'm looking for - I want Linq ;)

[EDIT] It's been suggested that this question and its answer makes my question a duplicate of the same. I read that Q'n'A, but wasn't satisfied, before posting my own question. Here are some (of my) reasons why that was:

  • The claim in the answer that GroupBy is always processed in-memory seemed unsubstantiated (but more on that below).
  • That other question is about a specific query generating (among other things) errors, and the answer contains a solution to that problem which is not applicable to my situation.
  • That other question - or rather the answer - does not specifically address the performance issues (of my context), and does not contain much of a discussion on how Linq queries translate to SQL in EF Core.
  • The answer makes claims without referencing versions etcetera, and may soon become obsolete.

So, what about the GroupBy-in-memory claim? Well, I agree that this piece of information is relevant to my case. Obviously summarizing on groups at the database level is not possible if grouping is performed in-memory. In the comments section of the above mentioned answer, there is indeed also a reference to the EF Core Road-map that I overlooked before. From what I gather, proper grouping will be included in the 2.1 release scheduled for Q4 2017.

I now think that the answer to my question here, pending that release of EF Core 2.1, should be a clear reference to the documentation and road-map. It remains to be seen if the actual summarizing problem that I have will be solved along with the grouping issues. (I don't see why it shouldn't, but still...)

Such an answer could be updated when new information becomes available.

Oskar Lindberg
  • 2,255
  • 2
  • 16
  • 36
  • Possible duplicate of [GroupBy first result of lookup table in Entity Framework Core](https://stackoverflow.com/questions/41569461/groupby-first-result-of-lookup-table-in-entity-framework-core) – CodeNotFound Aug 24 '17 at 21:34
  • 1
    The answer to this question said that GroupBy ios always processed in memory :) – CodeNotFound Aug 24 '17 at 21:34
  • 1
    *The answer makes claims without referencing versions* You seem to be missing *The following applies to latest at this time EF Core **v1.1.0*** :) Anyway, your question is not duplicate, but like most of the EFC questions (and answers, which are basically workarounds, and not solutions) caused by EFC being released with lot of bugs, limitations and incomplete features will become obsolete sooner or later. What about the performance, of course that's the main problem with EFC *client evaluation* "feature" - slow, but "works". But that "feature" allowed releasing EFC earlier than it should. – Ivan Stoev Aug 26 '17 at 13:39
  • 1
    Yes, I did miss that. Can't explain how I managed to do so. Sorry. I feel stupid. Thanks for your patience f^_^; – Oskar Lindberg Aug 26 '17 at 14:56
  • 1
    No need to feel stupid. EF6 and earlier never auto-switched to client-side evaluation. LINQ-to-SQL did (does), but only for functions that couldn't be translated to SQL for (more) obvious reasons. I'd call this highly unexpected behavior for a mature ORM --which is an implicit qualification of EF-core. – Gert Arnold Aug 26 '17 at 20:14

0 Answers0