I see a very big difference in performance which I'm not able to explain, because they should run exactly the same.
They're considerably different in approach. The first part of your initial method's query:
DbContext.ContactFeedback
.Where(c => c.FeedbackDate >= startDate &&
c.FeedbackDate <= endDate && userIds.Contains(c.UserId) &&
(c.Type == FeedbackType.A || c.Type == FeedbackType.B || c.Type == FeedbackType.C))
Is equivalent to:
DbContext.ContactFeedback
.Where(new Expression<Func<ContactFeedback, bool>>(new Func<ContactFeedback, bool>(c => c.FeedbackDate >= startDate && c.FeedbackDate <= endDate && userIds.Contains(c.UserId) &&
(c.Type == FeedbackType.A || c.Type == FeedbackType.B ||
c.Type == FeedbackType.C)))
When you call .Where
on an IQueryable<T>
it will (barring a case where the type implementing IQueryable<T>
has its own appliable .Where
which would be strange) call into:
public static IQueryable<TSource> Where<TSource>(
this IQueryable<TSource> source,
Expression<Func<TSource, bool>> predicate
)
Bearing in mind that lambdas in source code can be turned into either a Func<…>
or an Expression<Func<…>>
as applicable.
Entity Framework then combines this query with the GroupBy
and finally upon Count()
turns the entire query into the appropriate SELECT COUNT …
query, which the database performs (just how quickly depending on table contents and what indices are set, but which should be reasonably quick) and then a single value is sent back from the database for EF to obtain.
Your version though has explicitly assigned the lambda to a Func<ContactFeedback, bool>
. As such using it with Where
it has to call into:
public static IEnumerable<TSource> Where<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate
)
So to do the Where
EF has to do retrieve every column of every row from the database, and then filter out those rows for which that Func
returns true, then group them in memory (which requires storing the partially-constructed groups) before doing a Count
by a mechanism like:
public int Count<T>(this IEnumerable<T> source)
{
/* some attempts at optimising that don't apply to this case and so in fact just waste a tiny amount omitted */
int tally = 0;
using(var en = source.GetEnumerator())
while(en.MoveNext())
++tally;
return tally;
}
This is a lot more work with a lot more traffic between the EF and database, and so a lot slower.
A rewrite of the sort you attempted would be better approximated by:
public int GetRewritten(DateTime startDate, DateTime endDate, List<int> userIds)
{
DateTime dt = DateTime.UtcNow;
var query = new Expression<Func<ContactFeedback, bool>>(c => c.FeedbackDate >= startDate && c.FeedbackDate <= endDate && userIds.Contains(c.UserId) &&
(c.Type == FeedbackType.A || c.Type == FeedbackType.B ||
c.Type == FeedbackType.C));
var ret = GetTotalLeadsByFeedback(query);
Console.WriteLine(string.Format("{0}",DateTime.UtcNow - dt));
return ret;
}
private int GetTotalLeadsByFeedback(Expression<Func<ContactFeedback, bool>> predicate)
{
return DbContext.ContactFeedback
.Where(predicate)
.GroupBy(x => new { TruncateTime = DbFunctions.TruncateTime(x.FeedbackDate), x.LeadId, x.UserId })
.Count();
}
(Note also that I changed the name of the predicate to predicate
, as predicate
is more commonly used for predicates, query
for a source along with zero or more methods acting upon it; so DbContext.ContactFeedback
, DbContext.ContactFeedback.Where(predicate)
and DbContext.ContactFeedback.Where(predicate).GroupBy(x => new { TruncateTime = DbFunctions.TruncateTime(x.FeedbackDate), x.LeadId, x.UserId })
would all be queries if enumerated, and DbContext.ContactFeedback.Where(predicate).GroupBy(x => new { TruncateTime = DbFunctions.TruncateTime(x.FeedbackDate), x.LeadId, x.UserId }).Count()
is a query that immediately executes and returns a single value).
Conversely, the form you ended up with could be written back into the style of GetOriginal
as:
public int GetNotOriginal(DateTime startDate, DateTime endDate, List<int> userIds)
{
DateTime dt = DateTime.UtcNow;
var ret = DbContext.ContactFeedback
.AsEnumerable()
.Where(c => c.FeedbackDate >= startDate &&
c.FeedbackDate <= endDate && userIds.Contains(c.UserId) &&
(c.Type == FeedbackType.A || c.Type == FeedbackType.B || c.Type == FeedbackType.C))
.GroupBy(x => new {TruncateTime = DbFunctions.TruncateTime(x.FeedbackDate), x.LeadId, x.UserId})
.Count();
Console.WriteLine(string.Format("{0}",DateTime.UtcNow - dt));
return ret;
}
Note the AsEnumerable
forcing the Where
and everything that follows to be executed in the .NET application, rather than on the database.