If we pass an array with only two values int[] ids = {1, 2}
to your method GetEntities
EntityFramework will generate the next query:
SELECT
[Extent1].[Id] AS [Id],
...
FROM [dbo].[Entity] AS [Extent1]
WHERE ( NOT EXISTS (SELECT
1 AS [C1]
FROM (SELECT
1 AS [C0]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
SELECT
1 AS [C0]
FROM ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
)) OR (1 = [Extent1].[Id]) OR (2 = [Extent1].[Id])
If we increase the number of elements in ids
array this query becomes more complex with more levels of nestings. I think that EntityFramework uses some recursive algorithm to generate SQL-code for !ids.Any()
expression. When the number of elements in ids
array increases the depth of the recursion also increases. Therefore it generates StackOverflowException
when the number of elements in ids
array (and also the depth of the recursion) is large.
If we delete !ids.Any()
expression the next query will be generated:
SELECT
[Extent1].[Id] AS [Id],
...
FROM [dbo].[Entity] AS [Extent1]
WHERE [Extent1].[Id] IN (1,2)
Such query does not generate StackOverflowException
when the number of elements in the ids
array is large. Therefore it would be better to extract !ids.Any()
expression out of LINQ query:
public List<TEntity> GetEntities<TEntity>(int[] ids)
{
var someDbSet = new DbSet<TEntity>();
if (!ids.Any())
return someDbSet.ToList();
var resultQ = someDbSet.Where(t => ids.Contains(t.ID));
return resultQ.toList();
}
You should also take to account that there is a limitation on number of items for WHERE IN
condition: Limit on the WHERE col IN (...) condition.
ionutnespus wrote:
Yes, extracting the condition outside Where() is working. Still, I
couldn't find any explanation why EF would use such a complicated
algorithm for such a simple condition. Any thoughts on that?
I've decided to answer this question by extending this post because the asnwer is large and contains code.
I don't know for sure why EF generates such complex query but I've made some research and here are my thoughts. If we modify your GetEntites
method and use next condition in LINQ query:
someDbSet.Where(t => !ids.Any(i => i == 3) || ids.Contains(t.ID));
the next SQL-query will be generated if ids = {1, 2}
:
SELECT
[Extent1].[Id] AS [Id],
...
FROM [dbo].[Entity] AS [Extent1]
WHERE ( NOT EXISTS (
SELECT 1 AS [C1]
FROM (
SELECT 1 AS [C0] FROM ( SELECT 1 AS X ) AS [SingleRowTable1] WHERE 3 = 1
UNION ALL
SELECT 1 AS [C0] FROM ( SELECT 1 AS X ) AS [SingleRowTable2] WHERE 3 = 2
) AS [UnionAll1]
)) OR (1 = [Extent1].[Id]) OR (2 = [Extent1].[Id])
Here you can see that NOT EXISTS
condition contains two subqueries each of which checks if the next element of the ids
array equals required value. I think that it is logical to use NOT EXISTS
SQL-condition to represent Any()
method. But why does EF generates one subquery for each array element? In my opinion EF does so because because EF Team tried to write algorithm that generates queries that are not depend on database type. But this is only my opinion. May be it would be better to ask this question EF Team on github.