I have this Linq query:
DataBase
.GroupBy(data => new { data.X, data.Y }, (key, group) => new
{
key.X,
key.Y,
Somethings = group.Sum(x => x.Something),
[Code shortened for brevity]
})
.OrderBy(row => row.X)
.ThenBy(row => row.Y)
When using Linq-to-SQL it translates beautifully into the SQL query I was expecting, something like:
SELECT [t1].[x] AS [X], [t1].[y] AS [Y], [t1].[value] AS [Somethings], [...]
FROM (
SELECT SUM([t0].[something]) AS [value], [...], [t0].[x], [t0].[y]
FROM [table] AS [t0]
GROUP BY [t0].[x], [t0].[y]
) AS [t1]
ORDER BY [t1].[x], [t1].[y]
The point above being the SUM
performed in the database. However, with Linq-to -Entities in Entity Framework Core (2.0), I get something like the following:
SELECT [data0].[x], [data0].[y], [data0].[something], [...]
FROM [table] AS [data0]
ORDER BY [data0].[x], [data0].[y]
The point this time being no summarizing the data at database level (but rather in memory after all the data has been fetched). It should come as no surprise that this takes a lot longer.
I've searched and found similar questions answered, but nothing that I get in terms of how it translates to my specific query. I have two questions:
- How can I produce the optimized result (summarizing at database level) with Linq-to-Entities?
- As a bonus, if you know to explain some sort of reasoning behind this all that makes the behavior predictable?
(I know I can execute specific SQL using DbCommand
or something, but that's not the answer I'm looking for - I want Linq ;)
[EDIT] It's been suggested that this question and its answer makes my question a duplicate of the same. I read that Q'n'A, but wasn't satisfied, before posting my own question. Here are some (of my) reasons why that was:
- The claim in the answer that
GroupBy
is always processed in-memory seemed unsubstantiated (but more on that below). - That other question is about a specific query generating (among other things) errors, and the answer contains a solution to that problem which is not applicable to my situation.
- That other question - or rather the answer - does not specifically address the performance issues (of my context), and does not contain much of a discussion on how Linq queries translate to SQL in EF Core.
The answer makes claims without referencing versions etcetera, and may soon become obsolete.
So, what about the GroupBy
-in-memory claim? Well, I agree that this piece of information is relevant to my case. Obviously summarizing on groups at the database level is not possible if grouping is performed in-memory. In the comments section of the above mentioned answer, there is indeed also a reference to the EF Core Road-map that I overlooked before. From what I gather, proper grouping will be included in the 2.1 release scheduled for Q4 2017.
I now think that the answer to my question here, pending that release of EF Core 2.1, should be a clear reference to the documentation and road-map. It remains to be seen if the actual summarizing problem that I have will be solved along with the grouping issues. (I don't see why it shouldn't, but still...)
Such an answer could be updated when new information becomes available.