We are currently trying to optimize the performance of our Entity Framework queries. In particular, we are looking for ways to reduce CPU usage.
Using dotTrace, we analyzed what costs the most CPU time when executing different queries. See the snapshot below:
This snapshot is from a rather simple query, but it still shows that the most time-consuming operation is GetExecutionPlan(). Drilling down further, much of that time is spent in ComputeHashValue(), which is called recursively for every node in the expression tree.
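To illustrate why a recursive hash like this gets more expensive as the expression tree grows, here is a minimal sketch of a structural hash over a LINQ expression tree. It is not Entity Framework's actual ComputeHashValue() implementation; StructuralHashVisitor and Compute are hypothetical names, and the only real API used is System.Linq.Expressions.ExpressionVisitor.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical illustration only: NOT Entity Framework's internal code.
// Walks every node of an expression tree and folds it into one hash value.
public sealed class StructuralHashVisitor : ExpressionVisitor
{
    private int _hash = 17;

    // Number of nodes touched; the work grows with this count.
    public int NodesVisited { get; private set; }

    public static int Compute(Expression expression, out int nodesVisited)
    {
        var visitor = new StructuralHashVisitor();
        visitor.Visit(expression);
        nodesVisited = visitor.NodesVisited;
        return visitor._hash;
    }

    public override Expression Visit(Expression node)
    {
        if (node == null) return null;

        NodesVisited++;
        unchecked
        {
            // Mix in the node kind and its static type.
            _hash = _hash * 31 + (int)node.NodeType;
            _hash = _hash * 31 + node.Type.GetHashCode();
        }
        // Dispatches to the specific VisitXxx method, which recurses into children.
        return base.Visit(node);
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // Constants contribute their value, so "x > 5" and "x > 6" hash differently.
        unchecked { _hash = _hash * 31 + (node.Value?.GetHashCode() ?? 0); }
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Member accesses contribute the accessed property or field.
        unchecked { _hash = _hash * 31 + node.Member.GetHashCode(); }
        return base.VisitMember(node);
    }
}
```

Calling StructuralHashVisitor.Compute(expr, out var count) on two structurally identical lambdas yields the same hash within one process, and count reports how many nodes had to be visited to get it. Every node is visited once per computation, so a query with a large expression tree pays this cost on each execution.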
This blog post states that
The Entity Framework will walk the nodes in the expression tree and create a hash which becomes the key used to place it in the query cache.
So it seems that the hash values are only used as the key for the query cache. Since we are using IEnumerable.Contains() in our queries, EF will not cache them (see this MSDN article, chapters 3.2 & 4.1). Therefore, we disabled Query Plan Caching like so:
var objectContext = ((IObjectContextAdapter)dbContext).ObjectContext;
var objectSet = objectContext.CreateObjectSet<Customer>();
objectSet.EnablePlanCaching = false;
// use objectSet for queries..
We hoped that ComputeHashValue() would then no longer be called. However, there was no change in the Call Tree shown by dotTrace, and performance was identical to that with Query Plan Caching enabled.
Is there a reason why ComputeHashValue() is still needed when Query Plan Caching is disabled?
For our more complex queries, the calls to ComputeHashValue() account for up to 70% of the total CPU time needed for query execution, so avoiding these calls (if they are not needed) would improve our performance massively.