130

Given a huge collection of objects, is there a performance difference between the following?

Collection.Contains:

myCollection.Contains(myElement)

Enumerable.Any:

myCollection.Any(currentElement => currentElement == myElement)
Dan Lugg
SDReyes
  • 8
    With a collection of 10,000,000 ints, Contains wins by 300%, but it's worth considering the variations mentioned below. – SDReyes Dec 15 '10 at 01:51
  • 2
    This seems to show a stark contrast between the two: http://thedailywtf.com/Articles/State-of-the-UNION.aspx – David Peterson Aug 18 '14 at 20:16

4 Answers

171

Contains() is an instance method, and its performance depends largely on the collection itself. For instance, Contains() on a List is O(n), while Contains() on a HashSet is O(1).

Any() is an extension method, and will simply go through the collection, applying the delegate to each object until one matches. It therefore has a worst-case complexity of O(n).

Any() is more flexible, however, since you can pass a delegate. Contains() can only accept an object.
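As a rough illustration of the points above, here is a minimal sketch that runs the same lookup through List<T>.Contains, HashSet<T>.Contains, and Enumerable.Any. The class and method names (ContainsVsAnyDemo, InList, InSet, InSequence, HasValueAbove) are made up for this example, and it shows which code path each call takes rather than measuring anything.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsVsAnyDemo
{
    // List<T>.Contains: instance method, linear scan over the list.
    public static bool InList(List<int> items, int target) => items.Contains(target);

    // HashSet<T>.Contains: instance method, hash lookup (O(1) on average).
    public static bool InSet(HashSet<int> items, int target) => items.Contains(target);

    // Enumerable.Any: walks the sequence and invokes the delegate per element,
    // even when the underlying collection is a HashSet<T>.
    public static bool InSequence(IEnumerable<int> items, int target) => items.Any(x => x == target);

    // Any can express conditions other than "equals this element".
    public static bool HasValueAbove(IEnumerable<int> items, int threshold) => items.Any(x => x > threshold);

    public static void Main()
    {
        List<int> list = Enumerable.Range(0, 1000).ToList();
        HashSet<int> set = new HashSet<int>(list);

        Console.WriteLine(InList(list, 999));
        Console.WriteLine(InSet(set, 999));
        Console.WriteLine(InSequence(set, 999));
        Console.WriteLine(HasValueAbove(list, 500));
    }
}
```

All four calls find a match here; the difference is how much work each one does to get there, which only a benchmark on your own data can quantify.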

CarenRose
Etienne de Martel
  • 32
    `Contains` is also an extension method against `IEnumerable` (although some collections have their own `Contains` instance method too). As you say, `Any` is more flexible than `Contains` because you can pass it a custom predicate, but `Contains` *might* be slightly faster because it doesn't need to perform a delegate invocation for each element. – LukeH Dec 14 '10 at 23:45
  • 3
    Does **Any()** perform the operation on all objects in the collection or does it terminate with the first match? – Quark Soup Feb 14 '19 at 18:02
  • 3
    At least [according to the source](https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,8788153112b7ffd0), it stops on the first match. `All()` operates similarly. – Etienne de Martel Feb 14 '19 at 18:44
15

It depends on the collection. If you have an ordered collection, then Contains might do a smart search (binary, hash, b-tree, etc.), while with Any() you are basically stuck with enumerating until you find it (assuming LINQ-to-Objects).

Also note that in your example, Any() is using the == operator, which checks for referential equality on reference types unless the type overloads it, while Contains will use IEquatable<T> or the Equals() method, which might be overridden.
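The equality difference can be shown with a small sketch. Tag is a hypothetical class invented for this example: it implements IEquatable<Tag> and overrides Equals(), but deliberately does not overload ==, so the two approaches disagree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: value-based Equals, but no == overload.
public sealed class Tag : IEquatable<Tag>
{
    public string Name { get; }
    public Tag(string name) => Name = name;

    public bool Equals(Tag other) => other != null && Name == other.Name;
    public override bool Equals(object obj) => Equals(obj as Tag);
    public override int GetHashCode() => Name.GetHashCode();
}

public static class EqualityDemo
{
    // List<T>.Contains goes through EqualityComparer<T>.Default, which uses IEquatable<T>.
    public static bool ViaContains(List<Tag> tags, Tag wanted) => tags.Contains(wanted);

    // == on a class without an overload compares references.
    public static bool ViaAnyOperator(List<Tag> tags, Tag wanted) => tags.Any(t => t == wanted);

    // Calling Equals inside the predicate restores value semantics.
    public static bool ViaAnyEquals(List<Tag> tags, Tag wanted) => tags.Any(t => t.Equals(wanted));

    public static void Main()
    {
        var tags = new List<Tag> { new Tag("linq") };
        var wanted = new Tag("linq"); // equal by value, but a different instance

        Console.WriteLine(ViaContains(tags, wanted));    // True
        Console.WriteLine(ViaAnyOperator(tags, wanted)); // False
        Console.WriteLine(ViaAnyEquals(tags, wanted));   // True
    }
}
```

For types such as int or string, where == already compares values, the two forms give the same answer.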

CarenRose
tster
  • 6
    With .Any you can easily compare properties. With .Contains you can just compare objects and you need an extra IEqualityComparer to compare properties. – msfanboy Feb 04 '11 at 20:23
  • 1
    @msfanboy: That's true, but the question was specifically about performance and showed comparing the whole object. So I don't think that it is relevant here. – tster Feb 04 '11 at 20:35
4

I suppose that would depend on the type of myCollection, which dictates how Contains() is implemented. If it is a sorted binary tree, for example, it could search smarter. It may also take the element's hash into account. Any(), on the other hand, will enumerate through the collection until the first element that satisfies the condition is found. It gets no optimization even if the collection has a smarter search method.
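To make the "until the first element that satisfies the condition" behaviour visible, the sketch below counts how often the predicate runs. ShortCircuitDemo and CountPredicateCalls are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShortCircuitDemo
{
    // Returns how many times Any() invoked the predicate while looking for target.
    public static int CountPredicateCalls(IEnumerable<int> items, int target)
    {
        int calls = 0;
        items.Any(x =>
        {
            calls++;            // side effect used purely to observe enumeration
            return x == target;
        });
        return calls;
    }

    public static void Main()
    {
        var items = new List<int> { 10, 20, 30, 40 };

        Console.WriteLine(CountPredicateCalls(items, 20)); // 2: stops at the first match
        Console.WriteLine(CountPredicateCalls(items, 99)); // 4: no match, full enumeration
    }
}
```

The same counts would come out if items were a SortedSet<int> or HashSet<int>, since Any() only sees an IEnumerable<int> and cannot use the collection's own lookup.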

Jeff Mercado
0

Contains() is also an extension method, which can work fast if you use it in the correct way. For example:

var result = context.Projects.Where(x => lstBizIds.Contains(x.businessId)).Select(x => x.projectId).ToList();

This will give the query

SELECT Id FROM Projects INNER JOIN (VALUES (1), (2), (3), (4), (5)) AS Data(Item) ON Projects.UserId = Data.Item

while Any(), on the other hand, always iterates through the collection, which is O(n).

Hope this will work....

Uwais
  • 3
    In case you read this; You are conflating LINQ and "Linq To SQL". And where this discussion mainly centers around `IEnumerable`, your answer is about `IQueryable`. I didn't review it for correctness, but seems misplaced. – Suamere Mar 16 '21 at 16:07