Can't use DB indexes with Entity Framework using where clauses

Question

We are using entity framework to work with a table with 50 million entries (at a Microsoft SQL Server).

public async Task<List<TableName>> MyMethod(int fooId, int count)
{
    using (var context = new Context(ConnectionString))
    {
        return
            await context.TableName.AsNoTracking()
                .Where(_ => _.FooId== fooId)
                .OrderByDescending(_ => _.DateCreated)
                .Take(count)
                .ToListAsync();
    }
}

Entity Framework translates this to (beautified):

declare @fooId int = 42
SELECT TOP (100) *
FROM TableName
WHERE FooId = @fooId
ORDER BY DateCreated DESC

The columns FooId and DateCreated both got an index, but SQL Server does a full table scan anyway, which takes very long. This is caused because 42 is assigned outside of the statement (and seems to be related to selectivity). The indexes would be used if you wrote

WHERE FooId = 42

Is there a way to get Entity Framework to optimize the generated query? At the moment, my only way seems to be using raw SQL queries for huge tables in my code :-/ Is there a better workaround?

Edit: More details as requested in the comments: Non beautified query that is generated by entity framework:

SELECT TOP (100) 
    [Project1].[DateCreated] AS [DateCreated], 
    [Project1].[FooId] AS [FooId]
    FROM ( SELECT 
        [Extent1].[DateCreated] AS [DateCreated], 
        [Extent1].[FooId] AS [FooId]
        FROM [dbo].[TableName] AS [Extent1]
        WHERE [Extent1].[FooId] = @p__linq__0
    )  AS [Project1]
    ORDER BY [Project1].[DateCreated] DESC

-- p__linq__0: '42' (Type = Int32, IsNullable = false)

The create script for the index:

CREATE NONCLUSTERED INDEX [IX_TableName_FooId] ON [dbo].[TableName]
(
    [FooId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 85) ON [SECONDARY]
GO

Create script for my table:

CREATE TABLE [dbo].[TableName](
    [DateCreated] [datetime] NOT NULL,
    [FooId] [int] NULL
) ON [PRIMARY]

Just to clarify: have you tried hard coding the value and confirmed that indexes are used? How many unique values of `FooId` are there? Is there any chance you ran this statement against an empty table then populated the table? — Nick.Mc, Nov 22 '17 at 09:34
I added the exact query and the index create script, hope it helps. — TheWho, Nov 22 '17 at 10:42
No, it did not make a difference, if I move AsNoTracking() after Take. There are about 50 different entries for FooId = 42 in the database. — TheWho, Nov 22 '17 at 10:45
@Nick.McDermaid: I can reproduce the problem with my generated query in Management Studio. If I replace the variable there with the hard coded value it works like a charm. — TheWho, Nov 22 '17 at 10:49
`I can reproduce the problem with my generated query in Management Studio` When you say 'reproduce the problem' what do you mean exactly? Do you mean it is slower? Or you have examined the execution plan? Or something else? Does the problem occur if you **remove** `.AsNoTracking()`? — mjwills, Nov 22 '17 at 11:01
With 'reproducing the problem in Management Studio', I mean that the query takes a few mili seconds when executed with a hard coded value and over a minute when using a variable. — TheWho, Nov 22 '17 at 11:08
Does updating statistics or clearing the cache help? https://technet.microsoft.com/en-us/library/ms187348(v=sql.110).aspx https://stackoverflow.com/questions/1873025/how-can-i-clear-the-sql-server-query-cache — mjwills, Nov 22 '17 at 11:43
>>>I added the excution plan and the create statement <<< This is not a plan. It's hust a picture where no one can see operators properties. Actual plan should be placed here: https://www.brentozar.com/pastetheplan/ — sepupic, Nov 22 '17 at 13:46
Updateing the statistics and clearing the cache did not help. — TheWho, Nov 24 '17 at 08:37
We are experimenting with recompile at the moment, this seems to work :-) — TheWho, Nov 24 '17 at 08:38

Denis Rubashkin · Answer 1 · 2017-11-22T10:49:53.820

It's a problem with parameter sniffing.

Your query will be executed by the stored procedure which gets an input parameter:

EXEC sp_executesql N'SELECT TOP (100) *
FROM TableName
WHERE FooId = = @p__linq__0
ORDER BY DateCreated DESC', ' @p__linq__0 int', @p__linq__0 = 42

You should change the query to change the query parameter to a local variable:

EXEC sp_executesql N'
    DECLARE @pp__linq__0 int = @p__linq__0
    SELECT TOP (100) *
    FROM TableName
    WHERE FooId = = @pp__linq__0
    ORDER BY DateCreated DESC', ' @p__linq__0 int', @p__linq__0 = 42

For example our solution for this issue:

public class ParamPositioningInterceptor : DbCommandInterceptor
{
    public override void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        if (command.CommandText.StartsWith("SELECT") && command.Parameters.Count > 0)
        {
            StringBuilder sb1 = new StringBuilder(command.CommandText);
            StringBuilder sb2 = new StringBuilder();

            SqlParameter[] array = new SqlParameter[command.Parameters.Count];
            command.Parameters.CopyTo(array, 0);

            foreach (SqlParameter p in array.OrderByDescending(x => x.ParameterName.Length))
            {
                sb1.Replace("@" + p, "@p" + p);

                switch (p.SqlDbType)
                {
                    case SqlDbType.Char:
                    case SqlDbType.VarChar:
                    case SqlDbType.NChar:
                    case SqlDbType.NVarChar:
                        sb2.AppendFormat("DECLARE @p{0} {1}({2}) = @{0}", p, p.SqlDbType, p.Size);
                        break;
                    case SqlDbType.Decimal:
                        sb2.AppendFormat("DECLARE @p{0} {1}({2},{3}) = @{0}", p, p.SqlDbType, p.Precision, p.Scale);
                        break;
                    default:
                        sb2.AppendFormat("DECLARE @p{0} {1} = @{0}", p, p.SqlDbType);
                        break;
                }

                sb2.AppendLine();
            }

            command.CommandText = sb2.Append(sb1).ToString();
        }
        //
        base.ReaderExecuting(command, interceptionContext);
    }
}

Too bad I can only accept one answer. We added a recompile like @sepupic explained with a DBInterceptor, similar to your answer. This is working like a charm and the query takes a few milliseconds instead of minutes. — TheWho, Dec 04 '17 at 11:21

score 0 · Accepted Answer · edited Nov 24 '17 at 09:40

If your code is executed exactly like this:

declare @fooId int = 42
SELECT TOP (100) *
FROM TableName
WHERE FooId = @fooId
ORDER BY DateCreated DESC

i.e. has no parameters but local variables only, the estimated number of rows is C^1/2 (C = table cardinality) with non unique value. It will imply full scan.

You can try to pass the real value by using option(recompile).

This will cause your statement to be recompiled when the variable is already assigned, i.e. execution plan will consider a value that you effectivly passed.

To see what is actually executed and what was the estimation and real number of rows you should provide the actual execution plan.

We added a recompile like you explained with a DBInterceptor, similar to @Denis Rubashkins answer. This is working like a charm and the query takes a few milliseconds instead of minutes. — TheWho, Dec 04 '17 at 11:20

Can't use DB indexes with Entity Framework using where clauses

2 Answers2

Linked