Determine if a sequence contains all elements of another sequence using Linq

Question

Given two sets of values:

var subset = new[] { 2, 4, 6, 8 };

var superset = new[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

how do I determine if superset contains all elements of subset?

I have come up with this:

superset.Intersect(subset).Count() == subset.Count()

Is this the most logical and efficient method?

score 199 · Accepted Answer · answered Jan 02 '09 at 19:09

199

Count? How about Not Any?

bool contained = !subset.Except(superset).Any();

answered Jan 02 '09 at 19:09

Amy B

108,202
21
135
185

12

+1 Any() is much more efficient than Count() – JaredPar Jan 02 '09 at 19:17
1

Much better than count. This would also stop on the first non-match found, if I'm not mistaken. – configurator Jan 02 '09 at 21:21

Amy B · Answer 2 · 2009-01-02T21:42:23.543

34

So, my other answer was pretty easy to use. But it's an O(n*m) solution.

Here's a slightly less friendly O(n+m) solution. This should be used if superset is HUGE. It avoids repeatedly enumerating superset.

HashSet<int> hashSet = new HashSet<int>(superset);
bool contained = subset.All(i => hashSet.Contains(i));

edited Jan 02 '09 at 21:42

answered Jan 02 '09 at 21:35

Amy B

108,202
21
135
185

why is superset repeatedly enumerated in the other solution? https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,899 – user1121956 Oct 22 '21 at 13:13
@user1121956 I had assumed a different tradeoff, but your link shows I was wrong. – Amy B Oct 22 '21 at 14:30
Interestingly enough the .net core version is based on HashSet so your solution from this answer is actually similar to the implementation of HashSet's IsSuperSetOf. Someone interested could compare the performance of Hashset's IsSuperSetOf and IsSubsetOf. https://source.dot.net/#System.Linq/System/Linq/Except.cs,e289e6c98881b2b8,references https://source.dot.net/#System.Private.CoreLib/HashSet.cs,02e583c455a1c955,references https://source.dot.net/#System.Private.CoreLib/HashSet.cs,bb79d6d73e1765e7,references – user1121956 Oct 23 '21 at 10:20

score 14 · Answer 3 · answered Nov 09 '11 at 12:54

14

I have an extension method that uses the existing Contains()-method. I find it more intuitive than using Instersect() or Except().

public static bool ContainsAll<T>(this IEnumerable<T> source, IEnumerable<T> values)
{
    return values.All(value => source.Contains(value));
}

answered Nov 09 '11 at 12:54

Anders

734
8
12

That will potentially process all values of `source` for each `value`, which could be quite costly. The `Except` solution only processes each source and target value once, making it a more efficient solution overall. – Bryan Watts Nov 09 '11 at 14:36
@BryanWatts, I haven't looked "under the hood" to see how LINQ might be optimizing these different query methods. But if we imagine the most simple implementaion, aren't the two solutions quite equal? The Except-solution would need to iterate all "subset"-elements, and iterate "superset" for each of those values (unless LINQ somehow optimizes it). – Anders Nov 18 '11 at 13:24
1

Semantically they are equal: they will produce the same output for the same input. However, the LINQ implementation of `Except` iterates the source sequence once, stores the values (I think as a hash set), then iterates the target sequence once, removing its items from the set. LINQ is heavily optimized for minimum iteration. – Bryan Watts Nov 18 '11 at 14:07
I see. And I guess it might be generally easier for LINQ to optimize "pure" queries, without Lambda expressions in them. Still like the readability of mine though :-) – Anders Nov 21 '11 at 16:53
LINQ generally optimizes for minimum iteration even for methods with lambda expressions, such as `OrderBy`. Now, what you do inside those lambda expressions could be highly inefficient, but that is on you :-) I like the readability of yours too; I am considering adding your extension method with the `Except` implementation to get the best of both worlds. – Bryan Watts Nov 21 '11 at 17:00
Best of Both Worlds: `public static bool ContainsAll(this IEnumerable haystack, IEnumerable needles) { return !needles.Except(haystack).Any(); }` – Felix Alcala Nov 16 '12 at 23:16
This ContainsAll method should be called 'ContainsEvery' if the sets are different length it will return `false` – superlogical Nov 22 '13 at 00:13

score 4 · Answer 4 · answered Jan 02 '09 at 19:07

4

You could use Except and the resulting count should be 0.

Read up on MSDN for details of the parameters.

Example:

subset.Except(superset).Count() == 0

answered Jan 02 '09 at 19:07

leppie

115,091
17
196
297

8

It's a lot more efficient to do !Any() vs. Count() == 0. Count() will walk the entire enumerable while Any() will just look for the first element. – JaredPar Jan 02 '09 at 19:16

Determine if a sequence contains all elements of another sequence using Linq

4 Answers4

Linked

Related