I was given the following query in SQL (variable names obfuscated) which is trying to get the values (Ch, Wa, Bu, Hi) resulting in the greatest number (cnt) of Pi entries.
select top 1 Pi.Ch, Pi.Wa, Pi.Bu, Pi.Hi, COUNT(1) as cnt
from Product, Si, Pi
where Product.Id = Si.ProductId
and Si.Id = Pi.SiId
and Product.Code = @CodeParameter
group by Pi.Ch, Pi.Wa, Pi.Bu, Pi.Hi
order by cnt desc
which runs instantly in SQL management studio on our production database. I've successfully written the code a few ways in C# LINQ and Entity Framework, but every way the code runs in 8 - 10 seconds. One attempt is the following code (doing it without the print as one call gives the same performance results):
using(var context = new MyEntities()){
var query = context.Products
.Where(p => p.Code == codeFromFunctionArgument)
.Join(context.Sis, p => p.Id, s => s.ProductId, (p, s) => new { sId = s.Id })
.Join(context.Pis, ps => ps.sId, pi => pi.SiId, (ps, pi) => new {pi.Ch, pic.Wa, pic.Bu, pic.Hi})
.GroupBy(
pi => pi,
(k, g) => new MostPisResult()
{
Ch = k.Ch,
Wa = k.Wa,
Bu = k.Bu,
Hi = k.Hi,
Count = g.Count()
}
)
.OrderByDescending(x => x.Count);
Console.WriteLine(query.ToString());
return query.First();
}
}
which outputs the following SQL statements:
SELECT
[Project1].[C2] AS [C1],
[Project1].[Ch] AS [Ch],
[Project1].[Wa] AS [Wa],
[Project1].[Bu] AS [Bu],
[Project1].[Hi] AS [Hi],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [Ch],
[GroupBy1].[K2] AS [Wa],
[GroupBy1].[K3] AS [Bu],
[GroupBy1].[K4] AS [Hi],
1 AS [C2]
FROM ( SELECT
[Extent3].[Ch] AS [K1],
[Extent3].[Wa] AS [K2],
[Extent3].[Bu] AS [K3],
[Extent3].[Hi] AS [K4],
COUNT(1) AS [A1]
FROM [dbo].[Product] AS [Extent1]
INNER JOIN [dbo].[Si] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ProductId]
INNER JOIN [dbo].[Pi] AS [Extent3] ON [Extent2].[Id] = [Extent3].[SiId]
WHERE ([Extent1].[Code] = @p__linq__0) AND (@p__linq__0 IS NOT NULL)
GROUP BY [Extent3].[Ch], [Extent3].[Wa], [Extent3].[Bu], [Extent3].[Hi]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC
I've also tried in query syntax with about the same result. I also tried (but not for very long) executing the original SQL query directly with EF, but couldn't quickly get it working.
Is there some mistake I'm doing in translating the query to LINQ? Is there an obvious way I'm missing to improve the query? Is it possible to write the query in EF / LINQ with the same performance as the SQL statements?
====== Update ======
In SQL profiler the output for the original query is exactly the same. For the LINQ query it is very similar to what I posted above.
exec sp_executesql N'SELECT TOP (1)
[Project1].[C2] AS [C1],
[Project1].[Ch] AS [Ch],
[Project1].[Wa] AS [Wa],
[Project1].[Bu] AS [Bu],
[Project1].[Hi] AS [Hi],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [Ch],
[GroupBy1].[K2] AS [Wa],
[GroupBy1].[K3] AS [Bu],
[GroupBy1].[K4] AS [Hi],
1 AS [C2]
FROM ( SELECT
[Extent3].[Ch] AS [K1],
[Extent3].[Wa] AS [K2],
[Extent3].[Bu] AS [K3],
[Extent3].[Hi] AS [K4],
COUNT(1) AS [A1]
FROM [dbo].[Product] AS [Extent1]
INNER JOIN [dbo].[Si] AS [Extent2] ON [Extent1].[Id] = [Extent2].[ProductId]
INNER JOIN [dbo].[Pi] AS [Extent3] ON [Extent2].[Id] = [Extent3].[SiId]
WHERE ([Extent1].[Code] = @p__linq__0) AND (@p__linq__0 IS NOT NULL)
GROUP BY [Extent3].[Ch], [Extent3].[Wa], [Extent3].[Bu], [Extent3].[Hi]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'109579'
====== Update 2 ======
Here's the obfuscated XML output of the query execution plan on Snipt.org. Note the variable in question here is named "MagicalCode" in the output and both values "109579" and "2449-268-550" are valid (strings in C#) as in the final line of the XML output.
<ParameterList>
<ColumnReference
Column="@p__linq__0"
ParameterCompiledValue="N'109579'"
ParameterRuntimeValue="N'2449-268-550'" />
</ParameterList>
Plan image with actual row counts displayed
====== Update 3 ======
(hidden in a comment) I ran the EF generated SQL from entity framework in SSMS and it ran instantly. So I might be suffering from some form of parameter sniffing as hinted by this question. I'm not sure how to deal with it in the context of entity framework.
====== Update 4 ======
Updated Entity Framework SQL Execution Plan and SSMS SQL Query Execution Plan that can be opened with Plan Explorer.
====== Update 5 ======
Some workaround attempts
- Running the original query using
context.Database.SqlQuery<ReturnObject>(...)
ran in ~4-5 seconds. - Running the original query using
SqlCommand
and the connection string obtained from EF context took about 3 seconds (context initialization overhead). - Running the original query using
SqlCommand
took with hardcoded connection string takes about 1.5 seconds. So I ended up using the last one for now. The last thing I can think of is writing a stored procedure to get closer to the "instant" performance of running the query in SSMS.