One way to solve this is by backtracking. Here's a possible algorithm in pseudo code:
def backtrack(input_set, idx, partial_res, res, n):
if len(partial_res == n):
res.append(partial_res[:])
return
for i in range(idx, len(input_set)):
partial_res.append(input_set[i])
backtrack(input_set, idx+1, partial_res, res, n) # path with input_set[i]
partial_res.pop()
backtrack(input_set, idx+1, partial_res, res, n) # path without input_set[i]
Time complexity of this approach is O(2^len(input_set))
since we make 2 branches at each element of input_set
, regardless of whether the path leads to a valid result or not. The space complexity is O(len(input_set) choose n)
since this is the number of valid subsets you get, as you correctly pointed out in your question.
Now, there is a way to optimize the above algorithm to reduce the time complexity to O(len(input_set) choose n)
by pruning the recursive tree to paths that can lead to valid results only.
If n - len(partial_res) < len(input_set) - idx + 1
, we are sure that even if we took every remaining element in input_set[idx:]
we are still short at least one to reach n
. So we can employ this as a base case and return and prune.
Also, if n - len(partial_res) == len(input_set) - idx + 1
, this means that we need each and every element in input_set[idx:]
to get the required n
length result. Thus, we can't skip any elements and so the second branch of our recursive call becomes redundant.
backtrack(input_set, idx+1, partial_res, res, n) # path without input_set[i]
We can skip this branch with a conditional check.
Implementing these base cases correctly, reduces the time complexity of the algorithm to O(len(input_set) choose k)
, which is a hard limit because that's the number of subsets that there are.