My Problem
I am trying to generate a small subset of possible combinations from a very large Cartesian Product. My input would be an array of arrays, but the size of each array is dynamic. For now, I am using Python, but I am open to any language that needs to be used.
My Goal
After seeing this question: How to select specific item from cartesian product without calculating every other item, I thought that was an amazing algorithm that could generate a set given an index. However, that only specifically works with 3 arrays. My end goal would be something like this, where the function that determines the set is find_set
:
Input:
A = [ A_0, A_1, A_2, ..., A_n ]
B = [ B_0, B_1, B_2, ..., B_n ]
C = [ C_0, C_1, C_2, ..., C_n ]
D = [ D_0, D_1, D_2, ..., D_n ]
...
N = [ N_0, N_1, D_2, ..., N_n ]
List = [ A, B, C, D, ... N]
find_set(List, 0) -> [ A_0, B_0, C_0, ..., N_0 ]
find_set(List, 1) -> [ A_0, B_0, C_0, ..., N_1 ]
...
And so on and so forth, for any given index.
What I've Done So Far
I was using Python 2.7 and itertools.product
to generate all of the combinations, but this only makes an iterator. After running into memory consumption problems, I tried something like this:
results = itertools.product(*List)
# generates 100,000 random indices between 0 and the size of the Cartesian Product
desired_indices = random.sample(xrange(0, calculated_size - 1), 100000)
for item, index in results:
if index in desired_indices:
# do things
The thing is, this leads to O(N) operations regardless, and when I have 433,501,216 possible sets, this will take a VERY long time just to find an incredibly small subset. I appreciate all the help and any other resources I can look to for more knowledge on the subject.