4

I'm given a list of lists s:

s = [["a1", "A"], ["b4", "B"], ["a3", "A"], ["d6", "D"], ["c4", "C"]]

(note that the elements in a list do not necessarily begin with a same letter. I modified the data here for convenience.)

My goal is to sort each list to a category by its second element, and get all possible combinations by picking at most one element in each category.

I first hashed the list of lists to a dictionary:

dic = {i[1]: [] for i in s}
for i in s:
    # set the value of the first item key to the second item
    dic[i[1]].append(i[0])

dic
>>> {'A': ['a1', 'a3'], 'B': ['b4'], 'C': ['c4'], 'D': ['d6']}

The number of all possible combinations, hence the length of a powerset of s, should return 23:

{'a1'},
{'a3'},
{'b4'},
{'c4'}, 
{'d6'}, 
{'a1', 'b4'}, 
{'a1', 'c4'}, 
{'a1', 'd6'}, 
{'a3', 'b4'}, 
{'a3', 'c4'}, 
{'a3', 'd6'}, 
{'b4', 'c4'}, 
{'b4', 'd6'}, 
{'c4', 'd6'}, 
{'a1', 'b4', 'c4'}, 
{'a1', 'b4', 'd6'}, 
{'a1', 'c4', 'd6'}, 
{'a3', 'b4', 'c4'}, 
{'a3', 'b4', 'd6'}, 
{'a3', 'c4', 'd6'}, 
{'b4', 'c4', 'd6'}, 
{'a1', 'b4', 'c4', 'd6'}, 
{'a3', 'b4', 'c4', 'd6'}

I initially was going to put multiple for loops, but since I have no guarantee in how many key I would have in my s (which would also put my time complexity at O(N^x)), I used itertools.chain and itertools.combinations as per this post:

def powerset(s:list):
    return chain.from_iterable(combinations(s, r) for r in range(1, len(s)+1))

The problem is that this only takes elements in a single list to account, hence neglects the constraint: 'not taking more than one element from each list at most'. Flattening a list would disregard the categories, so I've not attempted to do so.

Any insights to tackle this problem would be appreciated.

jstaxlin
  • 517
  • 4
  • 18
  • 1
    Why doesn't your expected output also include `{}`? That's normally considered part of a powerset or cartesian product. – Karl Knechtel Sep 03 '21 at 01:50
  • Please also see https://stackoverflow.com/questions/533905/get-the-cartesian-product-of-a-series-of-lists (I reopened; this isn't close enough to be a duplicate, and the answer you're given shows a key and non-obvious technique.) – Karl Knechtel Sep 03 '21 at 01:53

2 Answers2

5

@don'ttalkjustcode's answer works but unnecessarily incurs the overhead of adding dummy values, and also produces an empty set, which is not required by the question.

A more direct approach would be to use itertools.combinations to pick lists from the dict of lists to pass to itertools.product to produce the desired combinations:

from itertools import product, combinations

print(*(
    set(p)
    for r in range(len(dic))
    for c in combinations(dic.values(), r + 1)
    for p in product(*c)
), sep='\n')

This outputs:

{'a1'}
{'a3'}
{'b4'}
{'c4'}
{'d6'}
{'a1', 'b4'}
{'a3', 'b4'}
{'a1', 'c4'}
{'a3', 'c4'}
{'d6', 'a1'}
{'d6', 'a3'}
{'c4', 'b4'}
{'d6', 'b4'}
{'d6', 'c4'}
{'a1', 'c4', 'b4'}
{'a3', 'c4', 'b4'}
{'d6', 'a1', 'b4'}
{'d6', 'a3', 'b4'}
{'d6', 'a1', 'c4'}
{'d6', 'a3', 'c4'}
{'d6', 'c4', 'b4'}
{'d6', 'a1', 'c4', 'b4'}
{'d6', 'a3', 'c4', 'b4'}
blhsing
  • 91,368
  • 6
  • 71
  • 106
3

You could turn each category list like ['a1', 'a3'] into a list like [[], ['a1'], ['a3']], take the product of those, and chain each product (Try it online!):

from itertools import product, chain

dic = {'A': ['a1', 'a3'], 'B': ['b4'], 'C': ['c4'], 'D': ['d6']}
for p in product(*([[]] + [[s] for s in v] for v in dic.values())):
    print({*chain(*p)})

Or add a special "skip"-value to each category and filter it out (Try it online!):

from itertools import product, chain

skip = object()

dic = {'A': ['a1', 'a3'], 'B': ['b4'], 'C': ['c4'], 'D': ['d6']}
for p in product(*([skip] + v for v in dic.values())):
    print(set(p) - {skip})

Hmm, I started with your already processed dic because you did that well, but ... if you start with dic = {i[1]: [skip] for i in s} right away, then you don't need to add it later and can just do this (Try it online!):

for p in product(*dic.values()):
    print(set(p) - {skip})

Doing the same with the first allows (Try it online!):

for p in product(*dic.values()):
    print({*chain(*p)})

Output of all (last two might show different orders due to string hash randomization):

set()
{'d6'}
{'c4'}
{'d6', 'c4'}
{'b4'}
{'d6', 'b4'}
{'b4', 'c4'}
{'d6', 'b4', 'c4'}
{'a1'}
{'d6', 'a1'}
{'a1', 'c4'}
{'d6', 'a1', 'c4'}
{'b4', 'a1'}
{'d6', 'b4', 'a1'}
{'b4', 'a1', 'c4'}
{'d6', 'b4', 'a1', 'c4'}
{'a3'}
{'d6', 'a3'}
{'c4', 'a3'}
{'d6', 'c4', 'a3'}
{'b4', 'a3'}
{'d6', 'b4', 'a3'}
{'c4', 'b4', 'a3'}
{'d6', 'c4', 'b4', 'a3'}
no comment
  • 6,381
  • 4
  • 12
  • 30
  • Exactly so. The key insight here is that for an input list of N elements, there are N+1 possibilities: each of the elements, or no element. So we need an extra level of list wrapping in order to account for the number of elements to use for a given input, hence the transformation. – Karl Knechtel Sep 03 '21 at 01:52
  • @KarlKnechtel Added an alternative without wrapping :-) – no comment Sep 03 '21 at 01:56
  • That's the same kind of workaround, just with filtering dummy values instead of flattening a dummy layer of lists. But yeah, I can see reasons to prefer that aesthetically. – Karl Knechtel Sep 03 '21 at 01:58
  • @KarlKnechtel Third solution (also applicable to first one) is prettiest... – no comment Sep 03 '21 at 02:05