181

I have a list containing Id's of my UserProfile table. How can i select all UserProfiles based on the list of Id's i got in a var using LINQ?

var idList = new int[1, 2, 3, 4, 5];
var userProfiles = _dataContext.UserProfile.Where(......);

I got stuck right here. I can do this using for loops etc. But I'd rather do this with LINQ.

Yustme
  • 6,125
  • 22
  • 75
  • 104
  • 4
    searching and finding are 2 different things. But since you can look over my shoulder though the internet, could you tell me how you know i didn't search? wait don't tell! You saw it right? my point exactly. – Yustme May 29 '13 at 22:03
  • 6
    asking a question costs more time than doing a search. next time just assume 'he/she' did a search or 10. – Yustme May 30 '13 at 06:54
  • 2
    This still gets quite a bit of attention, so I thought I would mention that ReSharper does a very good job of suggesting places where you could turn iterative code into LINQ statements. For people new to LINQ it can be an indispensable tool to have for this purpose alone. – Yuck Jun 12 '14 at 17:31

4 Answers4

288

You can use Contains() for that. It will feel a little backwards when you're really trying to produce an IN clause, but this should do it:

var userProfiles = _dataContext.UserProfile
                               .Where(t => idList.Contains(t.Id));

I'm also assuming that each UserProfile record is going to have an int Id field. If that's not the case you'll have to adjust accordingly.

Yuck
  • 49,664
  • 13
  • 105
  • 135
  • Hi, yes the userprofile records contains id's. So somehow i would be doing something like t => t.id == idList.Contains(id) ? – Yustme May 29 '13 at 21:59
  • `Contains()` will handle that equality check on each `id` value if you use it as I've written in the answer. You don't have to explicitly write `==` anywhere when you're trying to compare the items of one set (the array) against another (the database table). – Yuck May 29 '13 at 22:00
  • Well the problem is that t holds the whole object of UserProfile, and the idList only contains int's. The compiler complained about something but i've managed to fix it. Thanks. – Yustme May 29 '13 at 22:05
  • 3
    @Yuck - Doesnt work for me , Says Function timed out ! Have disabled Lazy loading but still fails . – bhuvin Jul 18 '14 at 06:25
  • 1
    I get "Cannot convert lambda expression to type 'int' because it is not a delegate type". How to fix that? – Stian Aug 12 '19 at 18:22
126

Solution with .Where and .Contains has complexity of O(N square). Simple .Join should have a lot better performance (close to O(N) due to hashing). So the correct code is:

_dataContext.UserProfile.Join(idList, up => up.ID, id => id, (up, id) => up);

And now result of my measurement. I generated 100 000 UserProfiles and 100 000 ids. Join took 32ms and .Where with .Contains took 2 minutes and 19 seconds! I used pure IEnumerable for this testing to prove my statement. If you use List instead of IEnumerable, .Where and .Contains will be faster. Anyway the difference is significant. The fastest .Where .Contains is with Set<>. All it depends on complexity of underlying coletions for .Contains. Look at this post to learn about linq complexity.Look at my test sample below:

    private static void Main(string[] args)
    {
        var userProfiles = GenerateUserProfiles();
        var idList = GenerateIds();
        var stopWatch = new Stopwatch();
        stopWatch.Start();
        userProfiles.Join(idList, up => up.ID, id => id, (up, id) => up).ToArray();
        Console.WriteLine("Elapsed .Join time: {0}", stopWatch.Elapsed);
        stopWatch.Restart();
        userProfiles.Where(up => idList.Contains(up.ID)).ToArray();
        Console.WriteLine("Elapsed .Where .Contains time: {0}", stopWatch.Elapsed);
        Console.ReadLine();
    }

    private static IEnumerable<int> GenerateIds()
    {
       // var result = new List<int>();
        for (int i = 100000; i > 0; i--)
        {
            yield return i;
        }
    }

    private static IEnumerable<UserProfile> GenerateUserProfiles()
    {
        for (int i = 0; i < 100000; i++)
        {
            yield return new UserProfile {ID = i};
        }
    }

Console output:

Elapsed .Join time: 00:00:00.0322546

Elapsed .Where .Contains time: 00:02:19.4072107

Community
  • 1
  • 1
David Gregor
  • 1,855
  • 1
  • 13
  • 14
  • 4
    Can you back that up with numbers? – Yustme Nov 03 '14 at 17:24
  • Nice, does however make me curious what the timings would be when `List` is used. +1 – Yustme Nov 04 '14 at 22:57
  • Ok, here are the timings you are interested in: List took 13.1 seconds and HashSet took 0,7 ms! So the .Where .Contains is best only in case of HashSet (when .Contains has complexity O(1)). In other cases, the .Join is better – David Gregor Nov 05 '14 at 14:08
  • 8
    I get `Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator.` error when using LINQ2SQL datacontext. – Mayank Raichura Jan 30 '16 at 00:20
  • 2
    This answer will not likely give similar O(N * N) vs O(N) improvements in Linq2Sql or Linq2Entities which is what the question/example he is asking. When querying against a database, Linq generates SQL with a "join" or an "in()" clause automatically so it is not O(N*N). – crokusek Aug 27 '16 at 00:16
  • 5
    @Yustme - performance is _always_ a consideration. (I hate to be the "this should be the accepted answer" guy, but...) – jleach Jan 07 '17 at 12:33
  • 1
    Wow, thanks a lot for that answer. Actually it should be clear that joining the needed data is more efficient than comparing two lists by iterating again and again... But sometimes one needs time to get the idea and the correct syntax. Thank you for that – seawave_23 Aug 08 '20 at 13:50
  • 1
    @crokusek 's comment ended up true for me. `Where(e => listEmpID.Contains(e.ID))` turned into `WHERE [e].[ID] IN (1, 2, 3)` with Entity Framework. I guess using Contains is fine in this case. – andromeda947 Oct 22 '21 at 19:36
  • @andromeda947 Be careful with your solution. In case listEmpID has many items, it will generate a very long IN clause and can hit the limit of your server. Microsoft gives a bit more info about this https://learn.microsoft.com/en-us/sql/t-sql/language-elements/in-transact-sql?redirectedfrom=MSDN&view=sql-server-ver15 – David Gregor Mar 22 '22 at 10:31
35

Nice answers abowe, but don't forget one IMPORTANT thing - they provide different results!

  var idList = new int[1, 2, 2, 2, 2]; // same user is selected 4 times
  var userProfiles = _dataContext.UserProfile.Where(e => idList.Contains(e)).ToList();

This will return 2 rows from DB (and this could be correct, if you just want a distinct sorted list of users)

BUT in many cases, you could want an unsorted list of results. You always have to think about it like about a SQL query. Please see the example with eshop shopping cart to illustrate what's going on:

  var priceListIDs = new int[1, 2, 2, 2, 2]; // user has bought 4 times item ID 2
  var shoppingCart = _dataContext.ShoppingCart
                     .Join(priceListIDs, sc => sc.PriceListID, pli => pli, (sc, pli) => sc)
                     .ToList();

This will return 5 results from DB. Using 'contains' would be wrong in this case.

JTech
  • 3,420
  • 7
  • 44
  • 51
Tomino
  • 5,969
  • 6
  • 38
  • 50
15

That should be simple. Try this:

var idList = new int[1, 2, 3, 4, 5];
var userProfiles = _dataContext.UserProfile.Where(e => idList.Contains(e));
Diogo Cid
  • 3,764
  • 1
  • 20
  • 25
Fabian Bigler
  • 10,403
  • 6
  • 47
  • 70