3

As a simple example let's assume we have an IQueryable named "myQueryable" with 3 columns Id, Value1 and Value2 and we write the following LINQ expression:

var result = myQueryable.Select(g => new 
{ 
    Value1Sum = g.Sum(b => b.Value1),
    Value2Sum = g.Sum(b => b.Value2)                                  
});

This will fire 2 different queries, one for each sum and then return both of them as an object like

{ Value1Sum = x , Value2Sum = y }

Is it possible to build the LINQ Expression in such a way that only one query is fired for both sums? After all they are both calculated on the basis of the same dataset..

To make it clearer, I'd expect a query like this:

SELECT SUM(Value1), SUM(Value2) FROM MyTable

but LINQ generates two queries like this(can't post the actual queries fired for security reasons):

exec sp_executesql N'SELECT 
[GroupBy1].[A1] AS [C1]
FROM ( SELECT 
    SUM([Extent1].[Value1]) AS [A1]
    FROM [mo].[MyTable] AS [Extent1]) AS [GroupBy1]

..same for Value2. Please note that even if it calls the inner select "GroupBy1" there is no grouping anywhere (also in the actual queries)

jimmious
  • 358
  • 3
  • 19

1 Answers1

2

Yes, you can do that in a single query to your database.

First, get all the values you need.

var values = MyQueryable.Select(item => new { Value1 = item.Value1 , Value2 = item.Value2 })
.ToList();

You use ToList() to force the query to execute. The Select() is only to minimize the data you query. You might as well use

var values = MyQueryable.ToList();

This is your only trip to the database.

Now that you have all data locally, you can aggregate these values.

Tuple<int, int> sums = values.Aggregate(
new Tuple<int, int>(0,0),
(tuple, item) => new Tuple<int, int>(tuple.Item1 + item.Value1, tuple.Item2 + item.Value2));

Afterwards you have a tuple sums where sums.Item1 represents the sum of all Value1 and sums.Item2 represents the sum of all Value2 respectively.

Can I use Aggregate directly on my IQueryable<T>?

Unfortunately, no. Aggregate() is not supported in LINQ to Entities, so you cannot use it on IQueryable<T> directly. This is why you first need to get the data locally so you have an IEnumerable<T>. Then you can aggregate the results using LINQ to objects. See this StackOverflow answer on the distinction between the two.

Community
  • 1
  • 1
Søren D. Ptæus
  • 4,176
  • 1
  • 26
  • 28
  • `First, get all the values you need.` what if table contains billions of rows? why do I need to fetch all of them if I need only two numbers? In that case it is much much better to do 2 db trips to get sum for val1 and sum for val2 rather then 1 db trip to get billions of rows – Dmitry Pavliv Jun 07 '18 at 12:47
  • @DmitryPavliv I get what you are saying. But you do not need only two numbers. You really need **two values for each row** otherwise you can not sum the values up. If you have **too many rows** to get them locally all at once you will need to get and sum them up in **batches** using `.Skip(entriesFinishedCount).Take(batchSize).ToList()`. See [How to loop through IEnumerable in batches](https://stackoverflow.com/questions/15414347/how-to-loop-through-ienumerable-in-batches). This also applies to `IQueryable`. Of course this means one query for each batch, but if you db is too big, that's it. – Søren D. Ptæus Jun 07 '18 at 12:53
  • 1
    Reading billions of rows (even in batches) and transferring them through network is a big overhead if you need only two values. Leave db work for db. Question is not about reducing # of db trips but minimizing resources usage. You're increasing load on db, backend and network – Dmitry Pavliv Jun 07 '18 at 12:57