0

I have to iterate through a collection of objects (let's say ID's), and execute a specific query for each of these objects. For example:

IEnumerable<int> ids = getIDs(); //[1,2,3,4...]

Right now I have this solution:

DBEntities db = new DBEntities();

var results =
    from a in db.TABLEA
    join b in db.TABLEB on a.id equals b.id
    join c in db.TABLEC on b.oid equals c.oid
    where ids.Contains(c.id)
    select a; 

but keep in mind that the list of IDs is smaller than the table where I am searching. That being said, the solution above seems inefficient, since I am looking for each record of my table against a smaller list, when I wanted the opposite. I also do not want to iterate through the list, and execute the query for one element at a time.

Ideally, I would want something like this:

DBEntities db = new DBEntities();
(some data structure) ids = getIDs();

var results =
    from a in db.TABLEA
    join b in db.TABLEB on a.id equals b.id
    join c in db.TABLEC on b.oid equals c.oid
    join i in ids on c.id equals i.id;

The (pseudo-)code above would iterate my elements of the list, in a single query, doing so in a single query and performing my filter by each element of the list.

Is this the way to do it? If so, what is the appropriate data structure to implement this solution? If not, which alternatives do I have?

Manuel Reis
  • 574
  • 8
  • 29
  • 1
    It is very much different if this is Linq2Sql (executing against a database) or linq to object (running the query against in memory collections). – Magnus Aug 10 '15 at 21:13
  • 1
    If it's LINQ to SQL, check what SQL is generated, and look at the execution plan. It's likely more efficient than you think. – Blorgbeard Aug 10 '15 at 21:13
  • Sorry, it's linq to entities; perhaps it is easier to provide a more concrete example - let me edit this question – Manuel Reis Aug 10 '15 at 21:14
  • 1
    Check out this : http://stackoverflow.com/questions/857973/linq-to-entities-sql-in-clause – alexm Aug 10 '15 at 21:16

2 Answers2

4

Magnus's answer is true but not right :)

Technically you do have two options in newer versions of Entity Framework (and I discovered this by chance). Contains of course, but also Join.

Joining with a local sequence of primitive types has always been possible, but very quickly (after some tens of elements) raised a SqlException:

Some part of your SQL statement is nested too deeply. Rewrite the query or break it up into smaller queries.

EF tries to translate the local list to a temporary table in SQL. This is surprisingly non-trivial. It has to build the table by UNION-ing select statements that return 1 element each. This is what it used to look like with only 5 elements!

....
INNER JOIN  (SELECT 
    [UnionAll3].[C1] AS [C1]
    FROM  (SELECT 
        [UnionAll2].[C1] AS [C1]
        FROM  (SELECT 
            [UnionAll1].[C1] AS [C1]
            FROM  (SELECT 
                1 AS [C1]
                FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]
            UNION ALL
                SELECT 
                2 AS [C1]
                FROM  ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1]
        UNION ALL
            SELECT 
            3 AS [C1]
            FROM  ( SELECT 1 AS X ) AS [SingleRowTable3]) AS [UnionAll2]
    UNION ALL
        SELECT 
        4 AS [C1]
        FROM  ( SELECT 1 AS X ) AS [SingleRowTable4]) AS [UnionAll3]
UNION ALL
    SELECT 
    5 AS [C1]
    FROM  ( SELECT 1 AS X ) AS [SingleRowTable5]) AS [UnionAll4] ON ....

As you see, if you look closely, the UNION statements are nested. The nesting level soon becomes too deep, which made this approach practically useless.

However, currently the SQL looks like this:

....
INNER JOIN  (SELECT 
    1 AS [C1]
    FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
    SELECT 
    2 AS [C1]
    FROM  ( SELECT 1 AS X ) AS [SingleRowTable2]
UNION ALL
    SELECT 
    3 AS [C1]
    FROM  ( SELECT 1 AS X ) AS [SingleRowTable3]
UNION ALL
    SELECT 
    4 AS [C1]
    FROM  ( SELECT 1 AS X ) AS [SingleRowTable4]
UNION ALL
    SELECT 
    5 AS [C1]
    FROM  ( SELECT 1 AS X ) AS [SingleRowTable5]) AS [UnionAll4] ON ....

Still not a beauty, but the nesting is replaced by chaining and the list can contain hundreds of elements.

But... (and this is why Magnus's answer is true), it doesn't perform well. A simple test with 2000 elements in the list took 2.5s with join and .25s with Contains. So there's still no practical case for joining with a local sequence.

Gert Arnold
  • 105,341
  • 31
  • 202
  • 291
3

If this is linq2Sql (or Linq2Entites) your only option is as in your example 1. You can not "join" a table with an in memory list. You have to use Contains. Which will be translated to an
Where c.id IN(2,3,4,5,...) SQL query

Magnus
  • 45,362
  • 8
  • 80
  • 118
  • I was somewhat confused by the fact that if I have a list of 2 elements and a table with 10M records, I must go through all table records against my smaller list (perhaps there are some optimizations I do not know about), instead of searching each element against my table (I would imagine something like a where clause where `WHERE id = list[1] OR id = list[2]`, which seems easier to optimize). – Manuel Reis Aug 10 '15 at 21:27
  • 1
    @ManuelReis it's translated into an `IN` statement in SQL, which is very efficient (until `ids` exceeds a couple of thousands of elements). – Gert Arnold Aug 10 '15 at 21:37
  • @GertArnold How about http://stackoverflow.com/a/9993948/2371337? Do you agree with this answer? I understand that 6 vs. 49 iterations is not something that affects performance greatly, but with more data, wouldn't it be worth it? (keeping in mind that i cannot join a table with an in-memory list) – Manuel Reis Aug 10 '15 at 21:54
  • I didn't know this answer, but in the mean time I happened to work on my own answer which says enough, I guess. – Gert Arnold Aug 10 '15 at 22:16