12

I've done a bit of research on this, and the best I've found so far is to use an Asenumerable on the whole dataset, so that the filtering occurs in linq to objects rather than on the DB. I'm using the latest EF.

My working (but very slow) code is:

        var trendData = 
            from d in ExpenseItemsViewableDirect.AsEnumerable()
            group d by new {Period = d.Er_Approved_Date.Year.ToString() + "-" + d.Er_Approved_Date.Month.ToString("00") } into g
            select new
            {
                Period = g.Key.Period,
                Total = g.Sum(x => x.Item_Amount),
                AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
            };

This gives me months in format YYYY-MM, along with the total amount and average amount. However it takes several minutes every time.

My other workaround is to do an update query in SQL so I have a YYYYMM field to group natively by. Changing the DB isn't an easy fix however so any suggestions would be appreciated.

The thread I found the above code idea (http://stackoverflow.com/questions/1059737/group-by-weeks-in-linq-to-entities) mentions 'waiting until .NET 4.0'. Is there anything recently introduced that helps in this situation?

Glinkot
  • 2,924
  • 10
  • 42
  • 67

3 Answers3

16

The reason for poor performance is that the whole table is fetched into memory (AsEnumerable()). You can group then by Year and Month like this

var trendData = 
            (from d in ExpenseItemsViewableDirect
            group d by new {
                            Year = d.Er_Approved_Date.Year, 
                            Month = d.Er_Approved_Date.Month 
                            } into g
            select new
            {
                Year = g.Key.Year,
                Month = g.Key.Month,
                Total = g.Sum(x => x.Item_Amount),
                AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
            }
       ).AsEnumerable()
        .Select(g=>new {
              Period = g.Year + "-" + g.Month,
              Total = g.Total,
               AveragePerTrans = g.AveragePerTrans
         });

edit

The original query, from my response, was trying to do a concatenation between an int and a string, which is not translatable by EF into SQL statements. I could use SqlFunctions class, but the query it gets kind ugly. So I added AsEnumerable() after the grouping is made, which means that EF will execute the group query on server, will get the year, month, etc, but the custom projection is made over objects (what follows after AsEnumerable()).

Adrian Iftode
  • 15,465
  • 4
  • 48
  • 73
  • 1
    Excellent, many thanks Adrian. I just did a quick test and yours takes about 3.5 secs compared to 5.2 for the original. The long delay must be some other step in my program. I appreciate your efforts, your code has been put to work! – Glinkot Mar 27 '12 at 10:49
  • This should not be the accepted answer. This is only a workaround! The answer from @cryss give a perfect result. – Steffen Mangold Nov 26 '15 at 14:45
  • Excellent solution, "group by new" is really powerful and simplifies complicated queries. – Jacob Feb 09 '17 at 22:10
9

When it comes to group by month i prefer to do this task in this way:

var sqlMinDate = (DateTime) SqlDateTime.MinValue;

var trendData = ExpenseItemsViewableDirect
    .GroupBy(x => SqlFunctions.DateAdd("month", SqlFunctions.DateDiff("month", sqlMinDate, x.Er_Approved_Date), sqlMinDate))
    .Select(x => new
    {
        Period = g.Key // DateTime type
    })

As it keeps datetime type in the grouping result.

cryss
  • 4,130
  • 1
  • 29
  • 34
2

Similarly to what cryss wrote, I am doing the following for EF. Note we have to use EntityFunctions to be able to call all DB providers supported by EF. SqlFunctions only works for SQLServer.

var sqlMinDate = (DateTime) SqlDateTime.MinValue; 

(from x in ExpenseItemsViewableDirect
let month = EntityFunctions.AddMonths(sqlMinDate, EntityFunctions.DiffMonths(sqlMinDate, x.Er_Approved_Date))
group d by month 
into g
select new
{
Period = g.Key,
   Total = g.Sum(x => x.Item_Amount),
   AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
}).Dump();

A taste of generated SQL (from a similar schema):

-- Region Parameters
DECLARE @p__linq__0 DateTime2 = '1753-01-01 00:00:00.0000000'
DECLARE @p__linq__1 DateTime2 = '1753-01-01 00:00:00.0000000'
-- EndRegion
SELECT 
1 AS [C1], 
[GroupBy1].[K1] AS [C2], 
[GroupBy1].[A1] AS [C3]
FROM ( SELECT 
    [Project1].[C1] AS [K1], 
    FROM ( SELECT 
        DATEADD (month, DATEDIFF (month, @p__linq__1, [Extent1].[CreationDate]), @p__linq__0) AS [C1]
        FROM [YourTable] AS [Extent1]
    )  AS [Project1]
    GROUP BY [Project1].[C1]
)  AS [GroupBy1]
NetMage
  • 26,163
  • 3
  • 34
  • 55
Boris
  • 471
  • 4
  • 8