Given I have N
(1 ≤ N
) numbers, 1 .. N
, efficiently generate the maximum number of possible L
(1 ≤ L
≤ N
) sized combinations of it with the restriction that any U
length subset of a combination (1 ≤ U
≤ L
) only appears once in the result. There are usually many results satisfying the constraints - any result will do.
For example, if N
is 5, L
is 3 and the final constraint is dropped (ie, the answer is just the combinations), then we have:
123,124,125,134,135,145,234,235,245,345;
Once a U
is 2 constraint is introduced any of these lines is an acceptable solution, provided it is generated efficiently:
123,145;
123,245;
123,345;
124,135;
124,235;
124,345;
125,134;
125,234;
125,345;
The ideal run time is O(size_of_output). In my use case N is always much bigger (an order of magnitude or more) than L, so anything faster than calculating all the combinations will be an improvement on what I came up with up (which is too slow):
import itertools
def unique_combinations(population, length, unique):
seen = set()
for r in itertools.combinations(range(population), length):
u = set(itertools.combinations(r, unique))
if not (u & seen):
yield r
seen |= u
Bonus points for providing a way deterministically pick any given solution from the list of valid ones. For example, there were 9 valid solution to the (N=5,L=3,U=2)
example above. Being able to an additional parameter 1 .. 9 selecting which was returned would be really nice.
A simple formula for calculating the number of combinations in the result would also be helpful.