2

I'm trying to use itertools.combinations to return unique combinations. I've searched through several similar questions but have not been able to find an answer.

An example:

>>> import itertools
>>> e = ['r','g','b','g']
>>> list(itertools.combinations(e,3))
[('r', 'g', 'b'), ('r', 'g', 'g'), ('r', 'b', 'g'), ('g', 'b', 'g')]

For my purposes, (r,g,b) is identical to (r,b,g) and so I would want to return only (rgb),(rgg) and (gbg).

This is just an illustrative example and I would want to ignore all such 'duplicates'. The list e could contain up to 5 elements. Each individual element would be either r, g or b. Always looking for combinations of 3 elements from e.

To be concrete, the following are the only combinations I wish to call 'valid': (rrr), (ggg), (bbb), (rgb).

So perhaps the question boils down to how to treat any variation of (rgb) as equal to (rgb) and therefore ignore it.

Can I use itertools to achieve this or do I need to write my own code to drop the 'dupliates' here? If no itertools solution then I can just easily check if each is a variation of (rgb), but this feels a bit 'un-pythonic'.

MSeifert
  • 145,886
  • 38
  • 333
  • 352
Bangkockney
  • 135
  • 5
  • So, why don't you simply choose one among your four valid outputs? – Thierry Lathuille Jun 01 '17 at 12:03
  • Sorry if not clear but I want to return all valid combinations subject to what I consider 'valid' as noted in the question. – Bangkockney Jun 01 '17 at 12:04
  • 2
    And what about `rrb` `rrg` etc..? Also if you're sure that your only valid outputs are `rrr`, `ggg`, `bbb` and `rgb` well then just output them, there's nothing to calculate? – Seb D. Jun 01 '17 at 12:10
  • [See this question.](https://stackoverflow.com/questions/36254303/how-many-different-ways-to-fill-n-number-bins-with-b-number-of-balls) Consider `r`, `g`, and `b` to be the bins with a capacity equal to their counts. Then, apply the algorithm in the accepted answer. – Jared Goguen Jun 05 '17 at 13:51

3 Answers3

2

You can use a set to discard duplicates.

In your case the number of characters is the way you identify duplicates so you could use collections.Counter. In order to save them in a set you need to convert them to frozensets though (because Counter isn't hashable):

>>> import itertools
>>> from collections import Counter
>>> e = ['r','g','b','g']
>>> result = []
>>> seen = set()
>>> for comb in itertools.combinations(e,3):
...     cnts = frozenset(Counter(comb).items())
...     if cnts in seen:
...         pass
...     else:
...         seen.add(cnts)
...         result.append(comb)
>>> result
[('r', 'g', 'b'), ('r', 'g', 'g'), ('g', 'b', 'g')]

If you want to convert them to strings use:

result.append(''.join(comb))  # instead of result.append(comb)

and it will give:

['rgb', 'rgg', 'gbg']

The approach is a variation of the unique_everseen recipe (itertools module documentation) - so it's probably "quite pythonic".

MSeifert
  • 145,886
  • 38
  • 333
  • 352
  • Thank you for this. Whilst it didn't produce what I was looking for, the technique was new to me and your link was insightful so +1. – Bangkockney Jun 17 '17 at 18:42
1

According to your definition of "valid outputs", you can directly build them like this:

from collections import Counter

# Your distinct values
values = ['r', 'g', 'b']

e = ['r','g','b','g', 'g']

count = Counter(e)
# Counter({'g': 3, 'r': 1, 'b': 1})

# If x appears at least 3 times, 'xxx' is a valid combination  
combinations = [x*3 for x in values if count[x] >=3]

# If all values appear at least once, 'rgb' is a valid combination
if all([count[x]>=1 for x in values]):
    combinations.append('rgb')

print(combinations)
#['ggg', 'rgb']

This will be more efficient than creating all possible combinations and filtering the valid ones afterwards.

Thierry Lathuille
  • 23,663
  • 10
  • 44
  • 50
0

It is not completely clear what you want to return. It depends on what comes first when iterating. For example if gbr is found first, then rgb will be discarded as a duplicate:

import itertools

e = ['r','g','b','g']       
s = set(e)
v = [s] * len(s)

solns = []

for c in itertools.product(*v):
    in_order = sorted(c)
    if in_order not in solns:
        solns.append(in_order)

print solns  

This would give you:

[['r', 'r', 'r'], ['b', 'r', 'r'], ['g', 'r', 'r'], ['b', 'b', 'r'], ['b', 'g', 'r'], ['g', 'g', 'r'], ['b', 'b', 'b'], ['b', 'b', 'g'], ['b', 'g', 'g'], ['g', 'g', 'g']]
Martin Evans
  • 45,791
  • 17
  • 81
  • 97