
I've seen several questions dealing with a similar problem, but none of the ones I've tried would work with my specific needs, even when I tried adapting them.

So I have a set of custom objects that contains anywhere between 1 and 30 objects. Many of those objects are the same (there are only about 8 types), and no assumptions can be made about how many of each will be present; some types might not be there at all.

I need to get all the combinations (unordered) of types of objects that can possibly be made with my set that have a size in a specific range. This means that I have a minimum and a maximum, and I need to get all the subsets whose size is greater than or equal to the minimum and less than or equal to the maximum.

This would be easy to do by just using one of the many algorithms that give me all the subsets, then removing the ones that don't fit the range. But for performance reasons I would like to avoid calculating them at all (most of the time, the range between minimum and maximum will be around 2).

Furthermore, also for performance reasons, I'd like to avoid all subsets that are the same because they use two versions of objects that are considered equal.

Examples (with integers instead of objects; you can use a bool AreSame(object1, object2) to check if two objects are the same):

Original set: {0, 1, 1, 5, 5, 6}
Minimum: 2
Maximum: 3
Expected result: {{0, 1}, {0, 5}, {0, 6}, {1, 1}, {1, 5}, {1, 6}, {5, 5}, {5, 6}, {0, 1, 1}, {0, 1, 5}, {0, 1, 6}, {0, 5, 5}, {0, 5, 6}, {1, 1, 5}, {1, 1, 6}, {1, 5, 5}, {1, 5, 6}, {5, 5, 6}}

Original set: {0, 1, 2}
Minimum: 1
Maximum: 4
Expected result: {{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}}

EDIT: Currently, I have this modified version of the code given in this answer:

public static List<List<Tile>> GetSubsets(List<Tile> set, int min, int max)
{
    List<List<Tile>> result = new List<List<Tile>>();
    int length = set.Count;
    int combinations = (int)Math.Pow(2, length); // 2^n possible subsets

    for (int count = 0; count < combinations; count++)
    {
        List<Tile> subset = new List<Tile>();
        uint rs = 0;
        while (rs < length && subset.Count <= max)
        {
            // Each bit of count selects whether set[rs] belongs to this subset
            if ((count & (1u << (int)rs)) > 0)
            {
                subset.Add(set[(int)rs]);
            }
            rs++;
        }
        if (subset.Count <= max && subset.Count >= min && !isInSet(result, subset))
            result.Add(subset);
    }
    return result;
}

isInSet is basically a function that runs a LINQ query to check whether an equivalent subset (built from objects that compare equal) already exists in the result.

Technically, this works. But in practice, the problem is really about performance. I have many objects that are the same, so this calculates far too many subsets, and I also compute a ton of subsets that I end up discarding. My problem is how to build those two constraints into the code in a way that actually accelerates it, by avoiding as many operations as possible when subsets do not match my needs. I can't figure out how to do that.

  • _"none of the ones I've tried would work with my specific needs, even when I tried adapting them"_ -- as stated, your question is too broad. You are asking Stack Overflow to write your implementation from scratch. Please provide a good [mcve] that shows your attempt that came closest to success, with respect to adapting one of the many existing answers that do exist already on Stack Overflow. Explain in detail how your particular scenario is not solved by that implementation, and what _specifically_ you are having trouble figuring out. – Peter Duniho May 13 '17 at 22:02
  • i've edited the question to add my attempt that works, since the problem is about making it faster. There is obviously, because of the nature of my needs, a huge number of operations done for nothing at the moment if, for example, I only want subsets of size 10-12 and have only three different types of objects in the original set of size 30. – Kaito Kid May 13 '17 at 22:33
  • does it kind of work if you use `new HashSet<HashSet<Tile>>(HashSet<Tile>.CreateSetComparer());` instead of `new List<List<Tile>>();` ? (`Tile` probably has to implement `IComparable`) – Slai May 14 '17 at 00:09

1 Answer


You can reduce your problem to that of finding k-multicombinations of a multiset. That is, start with a procedure that generates only subsets of size k, then vary k from minimum to maximum. Having to restart the procedure for each k isn't optimal, but given your constraints (multiset of size 1 to 30, with 1 to 8 distinct objects), this will do just fine unless your base algorithm is wildly inefficient.
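The driver for that approach can be a simple loop over k. A minimal sketch (here `GetSubsetsOfSizeK` is a hypothetical placeholder for whatever fixed-size k-multicombination routine you adapt):

```csharp
// Sketch: vary k from min to max, delegating to a fixed-size generator.
// GetSubsetsOfSizeK is a placeholder, not an existing method.
public static IEnumerable<List<Tile>> GetSubsetsInRange(List<Tile> set, int min, int max)
{
    for (int k = min; k <= Math.Min(max, set.Count); k++)
        foreach (var subset in GetSubsetsOfSizeK(set, k))
            yield return subset;
}
```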

You can try translating this recursive algorithm or this iterative algorithm from Python to C# with some effort.

If you don't care about the order in which the subsets are generated, then you can tweak the linked recursive algorithm to directly generate combinations with bounded cardinality:

static IEnumerable<List<T>> MultiCombinations<T>(List<T> multiset, int min, int max)
{
    Debug.Assert(min >= 0 && max >= 0);

    var a = multiset.OrderBy(x => x).ToList();

    max = Math.Min(max, a.Count);

    if (min <= max)
        foreach (var combo in Combine(a, min, max, 0, new List<T>()))
            yield return combo;
}

private static IEnumerable<List<T>> Combine<T>(List<T> a, int min, int max, int i, List<T> combo)
{
    if (i < a.Count && combo.Count < max)
    {
        combo.Add(a[i]);

        foreach (var c in Combine(a, min, max, i + 1, combo))
            yield return c;

        combo.RemoveAt(combo.Count - 1);

        var j = IndexOfNextUniqueItem(a, i);

        foreach (var c in Combine(a, min, max, j, combo))
            yield return c;
    }
    else if (combo.Count >= min)
    {
        Debug.Assert(combo.Count <= max);
        yield return new List<T>(combo);
    }
}

private static int IndexOfNextUniqueItem<T>(List<T> a, int i)
{
    int j = i + 1;

    while (j < a.Count && a[j].Equals(a[i]))
        j++;

    return j;
}
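For example, assuming the MultiCombinations method above is in scope, the first example from the question can be run like this:

```csharp
// First example from the question: min 2, max 3.
var set = new List<int> { 0, 1, 1, 5, 5, 6 };

foreach (var combo in MultiCombinations(set, 2, 3))
    Console.WriteLine(string.Join(", ", combo));

// This should print the 18 subsets of size 2 or 3 listed in the
// question (e.g. "0, 1", "0, 5", ..., "5, 5, 6"), each exactly once.
```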

This implementation avoids generating duplicates by sorting the original input, which requires your custom objects to be comparable. But the actual algorithm itself doesn't require sorting, just the ability to group together "like" or "equal" objects.

If you can guarantee that the input list of objects will always be grouped properly, then you can remove the sorting altogether. For example:

var groupedTiles = tilesList
    .GroupBy(tile => tile.VisibleFace)
    // or .GroupBy(tile => tile, tileEqualityComparer)
    .SelectMany(grp => grp)
    .ToList();

var combos = MultiCombinationsWithoutSorting(groupedTiles);
// ...

Here I assumed the visible face of your Tile objects has a proper .Equals() method, but you can also use one of the overloads of .GroupBy() and pass in a custom IEqualityComparer. You will also have to update the IndexOfNextUniqueItem() helper method to use whatever equality test you need.
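One possible way to parameterize that helper (a sketch; `comparer` is whatever `IEqualityComparer<T>` you passed to `.GroupBy()`):

```csharp
private static int IndexOfNextUniqueItem<T>(List<T> a, int i, IEqualityComparer<T> comparer)
{
    int j = i + 1;

    // Skip past every item the comparer considers equal to a[i].
    while (j < a.Count && comparer.Equals(a[j], a[i]))
        j++;

    return j;
}
```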

  • I'll try this as soon as I can. As for the comparable, I will implement it, because my objects are not literally the same (unlike for example, an int). They are tiles that can be turned around, and have two different "values" on the different sides. Being "equal" means "right now, the visible face shows the same thing", but the hidden face could be different, so I can't use your number trick. I'll accept the answer as soon as I'm able to confirm that it works. – Kaito Kid May 14 '17 at 13:55
  • @KaitoKid It sounds like implementing `IComparable` doesn't make sense for your use-case. Instead, you can replace the sorting operation with a group-by operation. I added a rough sketch of how you would do this to the answer. –  May 14 '17 at 21:57
  • I'll probably try both, as the tiles won't be turned around during a single round, so the double sided part is only there to add some randomness as to how many of each tiles are available for the round. I'll try this tomorrow and give you feedback. Thank you for your time – Kaito Kid May 15 '17 at 00:13