118

Given two sets of values:

var subset = new[] { 2, 4, 6, 8 };

var superset = new[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

how do I determine if superset contains all elements of subset?

I have come up with this:

superset.Intersect(subset).Count() == subset.Count()

Is this the most logical and efficient method?

Bryan Watts
  • 44,911
  • 16
  • 83
  • 88

4 Answers4

199

Count? How about Not Any?

bool contained = !subset.Except(superset).Any();
Amy B
  • 108,202
  • 21
  • 135
  • 185
34

So, my other answer was pretty easy to use. But it's an O(n*m) solution.

Here's a slightly less friendly O(n+m) solution. This should be used if superset is HUGE. It avoids repeatedly enumerating superset.

HashSet<int> hashSet = new HashSet<int>(superset);
bool contained = subset.All(i => hashSet.Contains(i));
Amy B
  • 108,202
  • 21
  • 135
  • 185
  • why is superset repeatedly enumerated in the other solution? https://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,899 – user1121956 Oct 22 '21 at 13:13
  • @user1121956 I had assumed a different tradeoff, but your link shows I was wrong. – Amy B Oct 22 '21 at 14:30
  • Interestingly enough the .net core version is based on HashSet so your solution from this answer is actually similar to the implementation of HashSet's IsSuperSetOf. Someone interested could compare the performance of Hashset's IsSuperSetOf and IsSubsetOf. https://source.dot.net/#System.Linq/System/Linq/Except.cs,e289e6c98881b2b8,references https://source.dot.net/#System.Private.CoreLib/HashSet.cs,02e583c455a1c955,references https://source.dot.net/#System.Private.CoreLib/HashSet.cs,bb79d6d73e1765e7,references – user1121956 Oct 23 '21 at 10:20
14

I have an extension method that uses the existing Contains()-method. I find it more intuitive than using Instersect() or Except().

public static bool ContainsAll<T>(this IEnumerable<T> source, IEnumerable<T> values)
{
    return values.All(value => source.Contains(value));
}
Anders
  • 734
  • 8
  • 12
  • That will potentially process all values of `source` for each `value`, which could be quite costly. The `Except` solution only processes each source and target value once, making it a more efficient solution overall. – Bryan Watts Nov 09 '11 at 14:36
  • @BryanWatts, I haven't looked "under the hood" to see how LINQ might be optimizing these different query methods. But if we imagine the most simple implementaion, aren't the two solutions quite equal? The Except-solution would need to iterate all "subset"-elements, and iterate "superset" for each of those values (unless LINQ somehow optimizes it). – Anders Nov 18 '11 at 13:24
  • 1
    Semantically they are equal: they will produce the same output for the same input. However, the LINQ implementation of `Except` iterates the source sequence once, stores the values (I think as a hash set), then iterates the target sequence once, removing its items from the set. LINQ is heavily optimized for minimum iteration. – Bryan Watts Nov 18 '11 at 14:07
  • I see. And I guess it might be generally easier for LINQ to optimize "pure" queries, without Lambda expressions in them. Still like the readability of mine though :-) – Anders Nov 21 '11 at 16:53
  • LINQ generally optimizes for minimum iteration even for methods with lambda expressions, such as `OrderBy`. Now, what you do inside those lambda expressions could be highly inefficient, but that is on you :-) I like the readability of yours too; I am considering adding your extension method with the `Except` implementation to get the best of both worlds. – Bryan Watts Nov 21 '11 at 17:00
  • Best of Both Worlds: `public static bool ContainsAll(this IEnumerable haystack, IEnumerable needles) { return !needles.Except(haystack).Any(); }` – Felix Alcala Nov 16 '12 at 23:16
  • This ContainsAll method should be called 'ContainsEvery' if the sets are different length it will return `false` – superlogical Nov 22 '13 at 00:13
4

You could use Except and the resulting count should be 0.

Read up on MSDN for details of the parameters.

Example:

subset.Except(superset).Count() == 0
leppie
  • 115,091
  • 17
  • 196
  • 297
  • 8
    It's a lot more efficient to do !Any() vs. Count() == 0. Count() will walk the entire enumerable while Any() will just look for the first element. – JaredPar Jan 02 '09 at 19:16