14

I'm trying to write an algorithm to select all combinations of n values from a set of numbers.

For instance, given the set: 1, 2, 3, 7, 8, 9

All combinations of 2 values from the set is:

(1, 2), (1, 3), (1, 7), (1, 8), (1, 9), (2, 3), (2, 7), (2, 8), (2, 9), (3, 7), (3, 8), (3, 9), (7, 8), (7, 9), (8, 9)

And 3 is:

(1, 2, 3), (1, 2, 7), (1, 2, 8), (1, 2, 9), (1, 3, 7), (1, 3, 8), (1, 3, 9), (1, 7, 8), (1, 7, 9), (1, 8, 9), (2, 3, 7), (2, 3, 8), (2, 3, 9), (2, 7, 8), (2, 7, 9), (2, 8, 9), (3, 7, 8), (3, 7, 9), (3, 8, 9), (7, 8, 9)

etc!

I'm currently using methods to to yield return sets of combinations of 2, 3 and 4 values, but it seems to me this could be generalised in a LINQ query.

TylerH
  • 20,799
  • 66
  • 75
  • 101
Dink
  • 151
  • 1
  • 1
  • 7
  • 1
    Have you looked at : [this](http://stackoverflow.com/a/774628/1698987) or [this](http://stackoverflow.com/a/4326669/1698987) answers ? – Noctis Oct 26 '15 at 00:11

3 Answers3

34

Usage:

var results = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 }.DifferentCombinations(3);

Code:

public static class Ex
{
    public static IEnumerable<IEnumerable<T>> DifferentCombinations<T>(this IEnumerable<T> elements, int k)
    {
        return k == 0 ? new[] { new T[0] } :
          elements.SelectMany((e, i) =>
            elements.Skip(i + 1).DifferentCombinations(k - 1).Select(c => (new[] {e}).Concat(c)));
    }
}
Jesus is Lord
  • 14,971
  • 11
  • 66
  • 97
user109260
  • 878
  • 7
  • 22
  • 2
    @Dink I ran several tests and this one always outperformed the solution I provided. It's also more succinct. I'd go with this. – Jesus is Lord Oct 26 '15 at 00:47
  • 3
    Thank you user109260 and @Jared - this is great, it really simplifies things and is much more flexible. – Dink Oct 26 '15 at 02:25
  • 1
    This solution is recursive thus it works for small sets as the example provided in the question yet will break the stack for large sets. – David Burg Sep 17 '18 at 06:31
  • 1
    We can get a nice speed improvement by removing the memory allocations , e.g. return k == 0 ? EnumerableEx.Return(Enumerable.Empty()) : elements.SelectMany((e, i) => elements.Skip(i + 1).DifferentCombinations2(k - 1).Select(c => EnumerableEx.Return(e).Concat(c))); However the alternative solution is still faster for large combinations like (30, 10). – Phil Jan 19 '19 at 15:19
  • @Phil Nice, I like that better. I improved it to only use System.Linq methods: `return k == 0 ? Enumerable.Empty>() : elements.SelectMany((e, i) => elements.Skip(i + 1).DifferentCombinations(k - 1).Select(c => Enumerable.Repeat(e, 1).Concat(c)));` – Blaise Roth Jan 22 '19 at 00:44
  • @BlaiseRoth I tried that. Doesn't work for me, returns an empty collection. – Phil Jan 22 '19 at 08:05
  • 1
    Whoops, here's the working code with all System.Linq methods: `return k == 0 ? Enumerable.Repeat(Enumerable.Empty(), 1) : elements.SelectMany((e, i) => elements.Skip(i + 1).DifferentCombinations(k - 1).Select(c => Enumerable.Repeat(e, 1).Concat(c)));` – Blaise Roth Oct 04 '19 at 02:18
  • 1
    In net5 or upper `.Select(c => c.Append(e))` – Trương Quốc Khánh Oct 04 '22 at 12:38
5

Both answers are good but can be speeded up by eliminating memory allocations

For answer 1: Now 2.5x faster when calculating 5 from 60

Edit: EnumerableEx.Return is from the System.Interactive package.

public static IEnumerable<IEnumerable<T>> DifferentCombinations2<T>
    (this IEnumerable<T> elements, int k)
{
    return k == 0 
        ? EnumerableEx.Return(Enumerable.Empty<T>()) 
        : elements.SelectMany((e, i) => 
            elements.Skip(i + 1)
                .DifferentCombinations(k - 1)
                .Select(c => EnumerableEx.Return(e).Concat(c)));
}

Answer 2: Now 3x faster when calculating 5 from 60

static class Combinations
{
    private static void SetIndexes(int[] indexes, int lastIndex, int count)
    {
        indexes[lastIndex]++;
        if (lastIndex > 0 && indexes[lastIndex] == count)
        {
            SetIndexes(indexes, lastIndex - 1, count - 1);
            indexes[lastIndex] = indexes[lastIndex - 1] + 1;
        }
    }

    private static bool AllPlacesChecked(int[] indexes, int places)
    {
        for (int i = indexes.Length - 1; i >= 0; i--)
        {
            if (indexes[i] != places)
                return false;
            places--;
        }
        return true;
    }

public static IEnumerable<IEnumerable<T>> GetDifferentCombinations<T>(this IEnumerable<T> c, int count)
{
    var collection = c.ToList();
    int listCount = collection.Count();

    if (count > listCount)
        throw new InvalidOperationException($"{nameof(count)} is greater than the collection elements.");

    int[] indexes = Enumerable.Range(0, count).ToArray();

    do
    {
        yield return indexes.Select(i => collection[i]).ToList();

        SetIndexes(indexes, indexes.Length - 1, listCount);
    }
    while (!AllPlacesChecked(indexes, listCount));
}
}

This results in answer 2 being 5x faster than answer 1 for 5 from 60.

dyeo
  • 15
  • 4
Phil
  • 42,255
  • 9
  • 100
  • 100
3

Though the above answer is very neat I came up with a solution which can be much faster depending on the collection size.

static class Combinations
{
    private static void InitIndexes(int[] indexes)
    {
        for (int i = 0; i < indexes.Length; i++)
        {
            indexes[i] = i;
        }
    }

    private static void SetIndexes(int[] indexes, int lastIndex, int count)
    {
        indexes[lastIndex]++;
        if (lastIndex > 0 && indexes[lastIndex] == count)
        {
            SetIndexes(indexes, lastIndex - 1, count - 1);
            indexes[lastIndex] = indexes[lastIndex - 1] + 1;
        }
    }

    private static List<T> TakeAt<T>(int[] indexes, IEnumerable<T> list)
    {
        List<T> selected = new List<T>();
        for (int i = 0; i < indexes.Length; i++)
        {
            selected.Add(list.ElementAt(indexes[i]));
        }
        return selected;
    }

    private static bool AllPlacesChecked(int[] indexes, int places)
    {
        for (int i = indexes.Length - 1; i >= 0; i--)
        {
            if (indexes[i] != places)
                return false;
            places--;
        }
        return true;
    }

    public static IEnumerable<List<T>> GetDifferentCombinations<T>(this IEnumerable<T> collection, int count)
    {
        int[] indexes = new int[count];
        int listCount = collection.Count();
        if (count > listCount)
            throw new InvalidOperationException($"{nameof(count)} is greater than the collection elements.");
        InitIndexes(indexes);
        do
        {
            var selected = TakeAt(indexes, collection);
            yield return selected;
            SetIndexes(indexes, indexes.Length - 1, listCount);
        }
        while (!AllPlacesChecked(indexes, listCount));

    }
}
  • 1
    Agree with the alternative. Nice and small. The Linq-version above is slow and very memory consuming. I like this one better. However, when your input list has non-unique indexes, like e.g. { 0,0,1,2,3,3,4,5,2 } there are many doubles in the result. If the purpose is to keep the result subsets unique (reducing the output !), consider https://stackoverflow.com/questions/51668174/c-sharp-combination-process-time-too-slow-and-out-of-memory-exception/51675525 – Goodies Aug 04 '18 at 00:03