Subset pattern implementation

Question

I am trying to write an implementation on C# of Subsets pattern read here 14 Patterns to Ace Any Coding Interview Question:

It looks obvious but confuses me. My research says me it should be implemented via Jagged Arrays (not Multidimensional Arrays). I started:

int[] input = { 1, 5, 3 };
int[][] set = new int[4][];
// ...

Could someone help with 2, 3 and 4 steps?

This is a recursive algorithm. Don't worry about the arrays. Worry about the algorithm. — jdweng, May 27 '21 at 14:20

D M · Accepted Answer · 2021-05-27T15:05:13.987

The instructions provided seem to lend themselves more to a c++ style than a C# style. I believe there are better ways than manually building arrays to get a list of subsets in C#. That said, here's how I would go about implementing the instructions as they are written.

To avoid having to repeatedly grow the array of subsets, we should calculate its length before we allocate it.

Assuming n elements in the input, we can determine the number of possible subsets by adding:

All subsets with 0 elements (the empty set)
All subsets with 1 element
All subsets with 2 elements
...
All subsets with n-1 elements
All subsets with n elements (the set itself)

Mathematically, this is the summation of the binomial coefficient. We take the sum from 0 to n of n choose k which evaluates to 2^n.

The jagged array should then contain 2^n arrays whose length will vary from 0 to n.

var input = new int[] { 1, 3, 5 };

var numberOfSubsets = (int)Math.Pow(2, input.Length);

var subsets = new int[numberOfSubsets][];

As the instructions in your article state, we start by adding the empty set to our list of subsets.

int nextEmptyIndex = 0;

subsets[nextEmptyIndex++] = new int[0];

Then, for each element in our input, we record the end of the existing subsets (so we don't end up in an infinite loop chasing the new subsets we will be adding) and add the new subset(s).

foreach (int element in input)
{
    int stopIndex = nextEmptyIndex - 1;

    // Build a new subset by adding the new element
    // to the end of each existing subset.
    for (int i = 0; i <= stopIndex; i++)
    {
        int newSubsetLength = subsets[i].Length + 1;
        int newSubsetIndex = nextEmptyIndex++;
        
        // Allocate the new subset array.
        subsets[newSubsetIndex] = new int[newSubsetLength];

        // Copy the elements from the existing subset.
        Array.Copy(subsets[i], subsets[newSubsetIndex], subsets[i].Length);
        
        // Add the new element at the end of the new subset.
        subsets[newSubsetIndex][newSubsetLength - 1] = element;
    }

}

With some logging at the end, we can see our result:

for (int i = 0; i < subsets.Length; i++)
{
    Console.WriteLine($"subsets[{ i }] = { string.Join(", ", subsets[i]) }");
}

subsets[0] = 
subsets[1] = 1
subsets[2] = 3
subsets[3] = 1, 3
subsets[4] = 5
subsets[5] = 1, 5
subsets[6] = 3, 5
subsets[7] = 1, 3, 5

Try it out!

Although I *feel* this approach doesn't match the question as it's asked, it is undoubtedly the way I would solve this. Great breakdown! If you want to amp the speed up even more with this , consider using bit shifting for the array size instead of `Math.Pow` and use direct memory management for the arrays such as `Span` and `Memory`! — DekuDesu, May 27 '21 at 15:24
@DekuDesu Good points, thanks! Here's a discussion about [how `Math.Pow()` is implemented in `C#`](https://stackoverflow.com/q/8870442/14956277) and why you might want to [use bit shifting instead of `Math.Pow()`](https://stackoverflow.com/a/47417271/14956277) (especially in a case like this where both arguments are integers and the base is `2`. — D M, May 27 '21 at 16:44
I am unfamiliar with using `Span` but its [introduction documentation](https://github.com/dotnet/corefxlab/blob/archive/docs/specs/span.md) suggests that it has "performance characteristics on par with `T[]`". Could you elaborate on how it would be useful in this case? Feel free to edit it into a supplemental information section at the bottom of my answer. — D M, May 27 '21 at 16:48
So I just spent the last hour trying to optimize your solution using `Span` and only got a ~2% increase in performance. I am immensely disappointed and my day is ruined. 0/10. In all seriousness however, the increase of performance in span is purely based of off size of data, how you access that data, and the general implementation(because the way the CPU cache works and `managed` memory overhead). It appears this particular problem **would not** benefit from `Span`(it does, but not worth it). [Here's the fiddle](https://dotnetfiddle.net/r1N4Av) if you want to play with it. — DekuDesu, May 27 '21 at 18:03

DekuDesu · Answer 2 · 2021-05-27T14:44:42.137

What I find easiest is translating the problem from a word problem into a more logical one.

Start with an empty set : [[]]

So the trick here is that this word problem tells you to create an empty set but immediately shows you a set that contains an element.

If we break this down into arrays instead(because I personally find it more intuitive) we can translate it to:

Start with an array of arrays, who's first element is an empty array. (instead of null)

So basically

int[][] result = new int[]{ new int[0] };

Now we have somewhere to start from, we can start to translate the other parts of the word problem.

Add the First Number (1) to all existing subsets to create subsets: [[],[1]]
Add the Second Number (5) to all existing subsets ...
Add the Third Number (3) to all existing subsets ...

There's a lot of information here. Let's translate different parts

Add the 1st Number ...
Add the 2nd Number ...
Add the nth Number ...

The repetition of these instructions and the fact that each number 1, 5, 3 matches our starting set of {1, 5, 3} tells us we should use a loop of some kind to build our result.

for(int i = 0; i < set.Length; i++)
{
    int number = set[i];

    // add subsets some how
}

Add the number to all existing subsets to create subsets: [[],[1]

A couple things here stand out. Notice they used the word Add but provide you an example where the number wasn't added to one of the existing subsets [[]] turned into [[],[1]]. Why is one of them still empty if we added 1 to all of them?

The reason for this is because when we create the new subsets and all their variations, we want to keep the old ones. So we do add the 1 to [](the first element) but we make a copy of [] first. That way when we add 1 to that copy, we still have the original [] and now a brand new [1] then we can combine them to create [[],[1]].

Using these clues we can decipher that Add the number to all existing subsets, actually means make copies of all existing subsets, add the number to each of the copies, then add those copies at the end of the result array.

int[][] result = new int[]{ new int[0] };

int[] copy = result[0];

copy.Append(1); // pseudo code

result.Append(copy); // pseudo code

// result: [[],[1]]

Let's put each of those pieces together and put together the final solution, hopefully!

Here's an example that I threw together that works(at least according to your example data).

object[] set = { 1, 5, 3 };

// [null]
object[][] result = Array.Empty<object[]>();

// add a [] to the [null] creating [[]]
Append(ref result, Array.Empty<object>());

// create a method so we can add things to the end of an array
void Append<T>(ref T[] array, T SubArrayToAdd)
{
    int size = array.Length;

    Array.Resize(ref array, size + 1);

    array[size] = SubArrayToAdd;
}

// create a method that goes through all the existing subsets and copies them, adds the item, and adds those copies to the result array
void AddSubsets(object item)
{
    // store the length of the result because if we don't we will infinitely expand(because we resize the array)
    int currentLength = result.Length;

    for (int i = 0; i < currentLength; i++)
    {
        // copy the array so we don't change the original
        object[] previousItemArray = result[i];  // []
       
        // add the item to it
        Append(ref previousItemArray, item);     // [1]

        // add that copy to the results
        Append(ref result, previousItemArray);   // [[]] -> [[],[1]]
    }
}

// Loop over the set and add the subsets to the result
for (int i = 0; i < set.Length; i++)
{
    object item = set[i];

    AddSubsets(item);
}

Subset pattern implementation

2 Answers2