2

I have list with repeated elements, for example array = [2,2,2,7].

If I use the solution suggested in this answer (using itertools.combinations()), I get:

()
(7,)
(2,)
(2,)
(2,)
(7, 2)
(7, 2)
(7, 2)
(2, 2)
(2, 2)
(2, 2)
(7, 2, 2)
(7, 2, 2)
(7, 2, 2)
(2, 2, 2)
(7, 2, 2, 2)

As you can see some of the 'combinations' are repeated, e.g. (7,2,2) appears 3 times.

The output I would like is:

()
(7,)
(2,)
(7, 2)
(2, 2)
(7, 2, 2)
(2, 2, 2)
(7, 2, 2, 2)

I could check the output for repeated combinations but I don't feel like that is the best solution to this problem.

eugenhu
  • 1,168
  • 13
  • 22
  • 1
    You can find all of the combinations and then turn it into a set ? – BcK May 06 '18 at 13:46
  • @BcK that's just the same as checking the whole output in terms of approach, isn't it? I mean I thought that it s bad practice. –  May 06 '18 at 13:48
  • What is your concern with looping through the output a second time to only grab unique values? Doing this with a nested for loops is the intuitive way to do it. – Jon Reinhold May 06 '18 at 13:50

3 Answers3

4

You can take the set of the combinations and then chain them together.

from itertools import chain, combinations

arr = [2, 2, 2, 7]

list(chain.from_iterable(set(combinations(arr, i)) for i in range(len(arr) + 1)))
# [(), (7,), (2,), (2, 7), (2, 2), (2, 2, 2), (2, 2, 7), (2, 2, 2, 7)]
BcK
  • 2,548
  • 1
  • 13
  • 27
1

You would need to maintain a set of tuples that are sorted in the same fashion:

import itertools as it 

desired=set([(),(7,),(2,),(7, 2),(2, 2),(7, 2, 2),(2, 2, 2),(7, 2, 2, 2)])
result=set()
for i in range(len(array)+1):
    for combo in it.combinations(array, i):
        result.add(tuple(sorted(combo, reverse=True)))

>>> result==desired
True
dawg
  • 98,345
  • 23
  • 131
  • 206
0

Without using itertools.combinations() and set's:

from collections import Counter
import itertools

def powerset(bag):
    for v in itertools.product(*(range(r + 1) for r in bag.values())):
        yield Counter(zip(bag.keys(), v))

array = [2, 2, 2, 7]

for s in powerset(Counter(array)):
    # Convert `Counter` object back to a list
    s = list(itertools.chain.from_iterable(itertools.repeat(*mv) for mv in s))
    print(s)

I believe your problem could alternatively be stated as finding the power set of a multiset, at least according to this definition.


However it's worth noting that the method shown above will be slower than the solutions in other answers such as this one which simply group the results from itertools.combinations() into a set to remove duplicates, despite being seemingly less efficient, it is still faster in practice as iterating in Python is much slower than in C (see itertoolsmodule.c for the implementation of itertools.combinations()).

Through my limited testing, the method shown in this answer will outperform the previously cited method when there are approximately 14 distinct elements in your array, each with an average multiplicity of 2 (at which point the other method begins to pull away and run many times slower), however the running time for either method under those circumstances are >30 seconds, so if performance is of concern, then you might want to consider implementing this part of your application in C.

eugenhu
  • 1,168
  • 13
  • 22