3

I have a situation where I have generated a number of lists which contain integer values. However, the number of these lists is only known at runtime, and integers present in the resulting list must exist in all lists. Is there a method of joining all of these lists into a single list?

i.e.

List<int> l1 = {1, 2, 3, 4};
List<int> l2 = {2, 3, 5, 7, 9};
List<int> l3 = {3, 9, 10};
List<int> ln = {....};

The resulting list should be as follows

List<int> r = {3};

Is this possible with linq or any other methods?

Ishmael
  • 30,970
  • 4
  • 37
  • 49
Ken Cowley
  • 73
  • 8
  • 3
    Are the lists in an array or a list or similar? You've shown them being defined at compile time, which doesn't agree with your question... – RB. Feb 20 '13 at 16:27
  • Also, does it have to be done in LINQ? Is this something where you are trying to avoid the records being pulled back from a DB or are the results being processed in memory on the local system? – AJ Henderson Feb 20 '13 at 16:28
  • See: http://stackoverflow.com/questions/3191810/linq-intersect-multiple-lists-some-empty for similar question. – dugas Feb 20 '13 at 16:32
  • Ken, where you get those lists from? – Sergey Berezovskiy Feb 20 '13 at 16:37
  • Each list is comprised of a sequence of primary keys returned from a either a UDF or another system. This is the reason it's not all done on the DB. The number of querys run is variable so the number of lists is only know after all results are found – Ken Cowley Feb 20 '13 at 20:09

3 Answers3

5

I assume that you have a List<List<int>> that holds a variable number of List<int>.

You can intersect the first list with the second list

var intersection = listOfLists[0].Intersect(listOfLists[1]);

and then intersect the result with the third list

intersection = intersection.Intersect(listOfLists[2]);

and so on until intersection holds the intersection of all lists.

intersection = intersection.Intersect(listOfLists[listOfLists.Count - 1]);

Using a for loop:

IEnumerable<int> intersection = listOfLists[0];

for (int i = 1; i < listOfLists.Count; i++)
{
    intersection = intersection.Intersect(listOfLists[i]);
}

Using a foreach loop (as shown by @lazyberezovsky):

IEnumerable<int> intersection = listOfLists.First();

foreach (List<int> list in listOfLists.Skip(1))
{
    intersection = intersection.Intersect(list);
}

Using Enumerable.Aggregate:

var intersection = listOfLists.Aggregate(Enumerable.Intersect);

If order is not important, then you can also use a HashSet<T> that you fill with the first list and intersect with with the remaining lists (as shown by @Servy).

var intersection = new HashSet<int>(listOfLists.First());

foreach (List<int> list in listOfLists.Skip(1))
{
    intersection.IntersectWith(list);
}
dtb
  • 213,145
  • 36
  • 401
  • 431
  • While all is true, this doesn't answer the (unclear) "however the number of these lists is only known at runtime" part of the question – ken2k Feb 20 '13 at 16:30
  • It should be fairly easy to keep the intersection variable outside the loop, then you can (presumably) iterate through the lists - edit: as per lazyberezovsky's answer – penguat Feb 20 '13 at 16:32
  • 2
    I like the `Aggregate` version. – Rawling Feb 20 '13 at 16:54
  • Thanks for the info will give this a try first thing in the morning – Ken Cowley Feb 20 '13 at 20:10
1
// lists is a sequence of all lists from l1 to ln
if (!lists.Any())
   return new List<int>();

IEnumerable<int> r = lists.First();   

foreach(List<int> list in lists.Skip(1))    
   r = r.Intersect(list);

return r.ToList();
Sergey Berezovskiy
  • 232,247
  • 41
  • 429
  • 459
  • This doesn't answer the (unclear) "however the number of these lists is only known at runtime" part of the question – ken2k Feb 20 '13 at 16:34
  • @ken2k that's because it's unclear from question how those lists are retrieved. I believe OP can enumerate over them, but I'll add a comment to question – Sergey Berezovskiy Feb 20 '13 at 16:37
0

Here is a simple method to get the intersection of a collection of collections:

public static IEnumerable<T> Intersect<T>(IEnumerable<IEnumerable<T>> sequences)
{
    using (var iterator = sequences.GetEnumerator())
    {
        if (!iterator.MoveNext())
            return Enumerable.Empty<T>();

        HashSet<T> intersection = new HashSet<T>(iterator.Current);

        while (iterator.MoveNext())
            intersection.IntersectWith(iterator.Current);

        return intersection;
    }
}

The idea here is to put all of the items into a set, then intersect that set with each sequence in turn. We can do this with a lot less code using plain LINQ, but this will be populating a new HashSet from the results for each intersection, rather than reusing a single one, so it will have a lot higher overhead despite looking really elegant.

Here is the less performant but more elegant solution:

public static IEnumerable<T> Intersect<T>(IEnumerable<IEnumerable<T>> sequences)
{
    return sequences.Aggregate(Enumerable.Intersect);
}
Servy
  • 202,030
  • 26
  • 332
  • 449