0

This is problem 9.4 from Cracking the Coding Interview 5th
The Problem: Write a method to return all the subsets of a set.

Here is my solution in Java.(tested it, it works!!!)

public static List<Set<Integer>> subsets(Set<Integer> s) {
    Queue<Integer> copyToProtectData  = new LinkedList<Integer>();
    for(int member: s) {
        copyToProtectData.add(member);
    }
    List<Set<Integer>> subsets = new ArrayList<Set<Integer>>();
    generateSubsets(copyToProtectData, subsets, new HashSet<Integer>());
    return subsets;
}
private static void generateSubsets(Queue<Integer> s, 
        List<Set<Integer>> subsets, Set<Integer> hashSet) {
    if(s.isEmpty()) {
        subsets.add(hashSet);
    } else {
        int member = s.remove();
        Set<Integer> copy = new HashSet<Integer>();
        for(int i:hashSet) {
            copy.add(i);
        }
        hashSet.add(member);
        Queue<Integer> queueCopy = new LinkedList<Integer>();
        for(int i:s){
            queueCopy.add(i);
        }
        generateSubsets(s, subsets, hashSet);
        generateSubsets(queueCopy, subsets, copy);          
    }
}

I looked at the solutions for this problem and the author said that the solution to this algorithm runs in O(2n) time complexity and O(2n) space complexity. I agree with her that this algorithm runs in O(2n) time because to solve this problem, you have to consider the fact that for any element, you have two possibilities, it can either be in the set or not. And because you have n elements, your problem will have 2n possibilities so the problem would be solved with O(2n) time.

However I believe that I have a compelling argument that my algorithm runs in O(n) space. I know that space complexity is "the total space taken by an algorithm with respect to the input size" Space Complexity and is relative to the the depth of a recursive call(remember this from some Youtube video I watched)

An example I have is generating [1,2,3] as a subset of [1,2,3]. Here is the set of recursive calls to generate that set
generateSubsets([], subsets, [1,2,3])
generateSubsets([3],subsets,[1,2])
generateSubsets([2,3],subsets,[1])
generateSubsets([1,2,3],subsets,[])

This show that the greatest depth of a recursive call with respect to the original set size n is n itself. Each of these recursive calls will have its own stack frame. So from this, I concluded that the space complexity is O(n) Does anyone see any flaws in my proof?

committedandroider
  • 8,711
  • 14
  • 71
  • 126

1 Answers1

8

You need to take into account all memory that is allocated by your algorithm (or, rather, the greatest amount of allocated memory that is "in use" at any time) - not only on the stack, but also on the heap. Each of the generated subsets is being stored in the subsets list, which will eventually contain 2n sets, each of size somewhere between 0 and n (with most of the sets containing around n / 2 elements) - so the space complexity is actually O(n 2n).

Aasmund Eldhuset
  • 37,289
  • 4
  • 68
  • 81
  • Oh ok, I agree with you that you also have to take into account the memory you allocated from the heap so that's how she got O(2^n). Wouldn't the size of the largest set operate independently of the number of sets though so you wouldn't multiply them? – committedandroider Mar 24 '15 at 05:42
  • @committedandroider You can definitely *enumerate* the subsets in O(n) space, but in this case the algorithm also decides to *store* them, and since there are 2^n of them, this takes a lot of space. – Niklas B. Mar 24 '15 at 08:06
  • The number of subsets containing _k_ items is equal to the number of subsets containing _n - k_ items. Therefore, the average subset size is _n / 2_, and the total size of the subsets is (n / 2) * 2^n = O(n * 2^n). – Aasmund Eldhuset Mar 24 '15 at 17:35
  • Can you guys explain why it be O(n) space complexity if you list listed them? I learned from here http://stackoverflow.com/questions/29224822/is-my-analysis-of-space-complexity-correct that you also have to take into account the memory you allocate from the heap, which in this case would be all the sizes of the set, n, n-1, n-2. So would it be O(n^2) then? – committedandroider Mar 24 '15 at 18:38
  • See the first parenthesis in my answer - it's only the maximal amount of memory you actually use at any given time that matters. Let's say that you allocate 1 MB, then 1 MB more, then free the first 1 MB (or simply stop referencing it if you're using a garbage collected language, such as Java), and allocate 1 MB more. Even though you've made 3 MB worth of allocations, at no time was more than 2 MB in use, so your space consumption would be 2 MB. (Note that the last block might well be allocated in the same physical space as the released block.) – Aasmund Eldhuset Mar 24 '15 at 18:44
  • How did you get this intuition? "The number of subsets containing k items is equal to the number of subsets containing n - k items.". I tested this and it works with a set of {1,2,3} but I don't understand how you came up with that. – committedandroider Mar 28 '15 at 20:29
  • @committedandroider: Mathematically, the number of ways to pick _r_ elements form a set of _n_ elements is called the [binomial coefficient](http://en.wikipedia.org/wiki/Binomial_coefficient) _C(n, r)_, and one of its fundamental properties is its symmetry: _C(n, r) = C(n, n - r)_. Intuitively, any selection of _r_ elements can also be thought of as a selection of the _n - r_ elements that are to be left out, so there must be as many ways to select _n - r_ elements as there are to select _r_ elements. – Aasmund Eldhuset Mar 28 '15 at 21:26
  • (An alternative formulation is that each subset with _r_ elements corresponds to a unique subset with _n - r_ elements: its complement.) – Aasmund Eldhuset Mar 28 '15 at 21:32
  • Oh ok that makes sense. I mathematically proved that too. How did you reach the conclusion "Therefore, the average subset size is n / 2,"? I now understand that say for a set of size 3, there will be same number of subsets of size 1 as there are subsets of size 2. How would you go from that to saying something about the average subset size? – committedandroider Mar 28 '15 at 21:50
  • Let's say that _ki_ is the number of subsets of size _i_, so the average subset size is _(k0 * 0 + k1 * 1 + k2 * 2 + ... + kn-2 * (n - 2) + kn-1 * (n - 1) + kn * n) / 2^n_. Since _k0 = kn_, _k1 = kn-1_ and so on, this equals _(k0 * 0 + k1 * 1 + k2 * 2 + ... + kn-2 * (n - 2) + kn-1 * (n - 1) + k0 * n) / 2^n = (k0 * (0 + n) + k1 * (1 + (n - 1)) + k2 * (2 + (n - 2)) + ... = (k0 * n + k1 * n + k2 * n + ...) / 2^n = n * (k0 + k1 + k2 + ...) / 2^n_. – Aasmund Eldhuset Mar 29 '15 at 00:26
  • The _n / 2_ terms inside the parenthesis are half the sum of all binomial coefficients of _n_, and because all bionomial coefficients represent all possible ways to pick elements from a set, their sum must equal _2^n_. Therefore, the expression equals _n * (2^n / 2) / 2^n = n / 2_. (This proof assumed that _n_ is even, but the proof for the odd case is similar.) More intuitively, every set is "balanced out" by its complement, and together, the average size of a set and its complement is _n / 2_. – Aasmund Eldhuset Mar 29 '15 at 00:27
  • Oh ok thanks. If you enumerate, I am still curious as to why the space complexity would be O(n). You mentioned garbage collection and I know that garbage collectors "free memory allocated to objects that are not being used by the program anymore" http://stackoverflow.com/questions/3798424/what-is-the-garbage-collector-in-java But here lets say when i work with the set {1,2,3}. When I enumerate the subset {1,2,3}, wouldn't the sets {}, {1}, {1,2} all be in memory under by implementation? – committedandroider Mar 29 '15 at 19:26
  • Not necessarily; a subset may be represented as e.g. a list containing the elements of the set, or a bit vector containing one element for each element of the universal set (each bit being 1 if the element is present in the set or 0 if not), so the memory consumption of one subset may be simply _O(n)_. – Aasmund Eldhuset Mar 29 '15 at 20:03