We are currently trying to optimize the performance of our Entity Framework queries. In particular, we look for ways to reduce CPU usage.

Using dotTrace, we analyzed what costs the most CPU time when executing different queries. See the snapshot below: dotTrace Call Tree

This snapshot is from a rather simple query, but it still shows which operation consumes the most time: GetExecutionPlan(). Drilling in further, much of that time is spent in ComputeHashValue(), which is called recursively for every node in the expression tree.
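To illustrate why a per-node hash adds up, here is a minimal sketch of a recursive hash walk over a LINQ expression tree. It is not EF's ComputeHashValue(): the `HashingVisitor` class and the hash formula are made up for illustration, and only `ExpressionVisitor` from `System.Linq.Expressions` is a real API.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical illustration only: this is NOT how EF computes its hash values.
sealed class HashingVisitor : ExpressionVisitor
{
    public int Hash { get; private set; } = 17;
    public int NodesVisited { get; private set; }

    // Visit() is the dispatch point for every node, so overriding it
    // lets us touch each node exactly once per visit.
    public override Expression Visit(Expression node)
    {
        if (node == null) return null;

        NodesVisited++;
        unchecked
        {
            // Fold the node kind and its static type into the running hash.
            Hash = Hash * 31 + (int)node.NodeType;
            Hash = Hash * 31 + node.Type.GetHashCode();
        }

        // The base implementation recurses into the node's children.
        return base.Visit(node);
    }
}

static class Program
{
    static void Main()
    {
        Expression<Func<int, bool>> predicate = x => x > 5 && x < 10;

        var visitor = new HashingVisitor();
        visitor.Visit(predicate);

        Console.WriteLine($"nodes visited: {visitor.NodesVisited}, hash: {visitor.Hash}");
    }
}
```

Even this tiny predicate produces several node visits, so a query with many joins, filters and projections can easily reach hundreds of calls.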

This blog post states that

The Entity Framework will walk the nodes in the expression tree and create a hash which becomes the key used to place it in the query cache.

So it seems that the hash values are only used as the key for the query cache. Since we are using IEnumerable.Contains() in our queries, EF will not cache them (see this MSDN article, chapters 3.2 & 4.1). Therefore, we disabled Query Plan Caching like so:

var objectContext = ((IObjectContextAdapter)dbContext).ObjectContext;
var objectSet = objectContext.CreateObjectSet<Customer>();
objectSet.EnablePlanCaching = false;
// use objectSet for queries..

We hoped that ComputeHashValue() would then no longer be called. However, the Call Tree shown by dotTrace did not change, and performance was identical to that with Query Plan Caching enabled.

Is there a reason why ComputeHashValue() is still needed when Query Plan Caching is disabled?

For our more complex queries, the calls to ComputeHashValue() together account for up to 70% of the CPU time needed for query execution, so avoiding these calls (if they are not needed) would improve our performance massively.

Fabian Gehri
  • According to the screenshot almost no time is spent there. Is this profile not representative of your workload? – usr May 03 '14 at 20:11
  • The 3 calls are denoted with 0ms here. But ComputeHashValue() is called for each node in the expression tree, which in total leads to hundreds of calls, and almost all of the 93ms are used by them. You can see the various recursive calls to ApplyRulesToSubtree(), which all ultimately lead to calls to ComputeHashValue(). – Fabian Gehri May 04 '14 at 11:16

1 Answer

Unfortunately, that's not how Entity Framework was implemented. I've looked a bit at the source code, and my understanding is that because it is compiling an ExecutionPlan anyway, it also calculates its hash value. That way, if EnablePlanCaching is enabled and no cached query is found, it can add the new plan to the cache manager using this computed value.

Here is a link to the class that handles this logic: EntitySqlQueryState
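As a rough picture of that control flow, here is a toy model. It assumes the cache lookup is the only step that the caching flag gates, while plan compilation (and whatever hashing happens inside it) runs on every cache miss. All names (`ToyQueryState`, `CompilePlan`, `PlansCompiled`) are hypothetical and are not EF types or members.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: names are hypothetical and do not mirror EF's actual classes.
sealed class ToyQueryState
{
    private readonly Dictionary<string, string> _planCache =
        new Dictionary<string, string>();

    public bool EnablePlanCaching { get; set; } = true;
    public int PlansCompiled { get; private set; }

    public string GetExecutionPlan(string queryText)
    {
        string cacheKey = null;

        // The caching flag only controls key generation and cache lookup.
        if (EnablePlanCaching)
        {
            cacheKey = "key:" + queryText;
            string cached;
            if (_planCache.TryGetValue(cacheKey, out cached))
            {
                return cached;
            }
        }

        // On a miss, or with caching disabled, the plan is compiled in full.
        string plan = CompilePlan(queryText);

        if (cacheKey != null)
        {
            _planCache[cacheKey] = plan;
        }
        return plan;
    }

    // Stand-in for the expensive step that walks the whole tree.
    private string CompilePlan(string queryText)
    {
        PlansCompiled++;
        return "plan(" + queryText + ")";
    }
}

static class Program
{
    static void Main()
    {
        var cachingOn = new ToyQueryState { EnablePlanCaching = true };
        cachingOn.GetExecutionPlan("q");
        cachingOn.GetExecutionPlan("q");
        Console.WriteLine("caching on:  " + cachingOn.PlansCompiled + " compilation(s)");  // prints 1

        var cachingOff = new ToyQueryState { EnablePlanCaching = false };
        cachingOff.GetExecutionPlan("q");
        cachingOff.GetExecutionPlan("q");
        Console.WriteLine("caching off: " + cachingOff.PlansCompiled + " compilation(s)"); // prints 2
    }
}
```

In this model, turning caching off never removes work from the compile step. It only removes the chance to skip it, which matches the unchanged profile described in the question.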

Rik van den Berg
  • Having looked at the source code ([ELinqQueryState](http://entityframework.codeplex.com/SourceControl/latest#src/EntityFramework/Core/Objects/ELinq/ELinqQueryState.cs), since we are using Linq2Entities) in more detail, I think that these calls to `ComputeHashValue()` are not used for generating the cache key. The `cacheKey` is generated first (using `ExpressionKeyGen.TryGenerateKey()`), and after that the execution plan is built (call to `_objectQueryExecutionPlanFactory.Prepare()`). So it seems that `ComputeHashValue()` is always needed regardless of whether plan caching is enabled or not. – Fabian Gehri May 04 '14 at 12:15
  • That's what I've tried to explain :). I also saw a lot of MergeOption parameters. Have you tried using MergeOption.NoTracking by calling the `AsNoTracking()` LINQ method to see what the performance difference is? [AsNoTracking](http://msdn.microsoft.com/en-us/library/gg679352(v=vs.103).aspx) – Rik van den Berg May 05 '14 at 11:50
  • AsNoTracking() makes no difference. MergeOption is only used to check if the cached query plan can be reused: "If a merge option was explicitly specified, and it does not match the plan's merge option, then the plan is no longer valid." – Fabian Gehri May 06 '14 at 13:28