8

I would like to generate a list of combinations. I will try to simplify my problem to make it understandable.

We have 3 variables :

  • x : number of letters
  • k : number of groups
  • n : number of letters per group

I would like to generate using python a list of every possible combinations, without any duplicate knowing that : i don't care about the order of the groups and the order of the letters within a group.

As an example, with x = 4, k = 2, n = 2 :

# we start with 4 letters, we want to make 2 groups of 2 letters
letters = ['A','B','C','D']

# here would be a code that generate the list

# Here is the result that is very simple, only 3 combinations exist.
combos = [ ['AB', 'CD'], ['AC', 'BD'], ['AD', 'BC'] ]

Since I don't care about the order of or within the groups, and letters within a group, ['AB', 'CD'] and ['DC', 'BA'] is a duplicate.

This is a simplification of my real problem, which has those values : x = 12, k = 4, n = 3. I tried to use some functions from itertools, but with that many letters my computer freezes because it's too many combinations.

Another way of seeing the problem : you have 12 players, you want to make 4 teams of 3 players. What are all the possibilities ?

Could anyone help me to find an optimized solution to generate this list?

Nested
  • 83
  • 4
  • Do you always have `x = k * n`, or is it `x >= k * n`? – Mad Physicist Mar 31 '22 at 17:47
  • This code will give you all the combinations: `comb = [(a,b) for a in letters for b in letters if a != b]`. I don't quite know how to remove the duplicates – António Rebelo Mar 31 '22 at 17:49
  • If `itertools` freezes your computer, I don't think you could implement something better using only Python – David Davó Mar 31 '22 at 17:58
  • Have you computed how many combinations you are talking about? Off the top of my head I think it is more than 1 billion. I may have misunderstood something or made an error, but if I'm anywhere close then the problem is hopeless. Even if you managed to compute the answer you would have no good way of looking at it. – Paul Cornelius Mar 31 '22 at 18:01
  • 4
    @DavidDavó. That's clearly not true – Mad Physicist Mar 31 '22 at 18:06
  • @PaulCornelius. You can easily make a generator for this... – Mad Physicist Mar 31 '22 at 18:06
  • 1
    Yes x = k*n. To make the problem more clear, i have 12 players, i want to make 4 teams of 3 players and would like to have all the possibilities. I couldnt compute all the combinations but it's less than 1 billion. If there were no duplicate, it would be fact(12) = 479.001.600 combos. But it will be way less because there are a lot of duplicates. For example with x = 4, fact(4) = 24 but whne you remove duplicates there are only 3 combos – Nested Mar 31 '22 at 18:07
  • Itertools is implemented in cpython, and returns a generator. I doubt a python's one liner could perform better – David Davó Mar 31 '22 at 18:10
  • Isn’t `k` determined by `x` and `n`? Can you provide an example where it’s not? – Abhijit Sarkar Mar 31 '22 at 18:23
  • 1
    yes k is determined by x and n. – Nested Mar 31 '22 at 18:24
  • @DavidDavó itertools generate all the possibilities (400 millions for x = 12). According to some comments, after removing duplicates i would have 369.000 combos. The idea is not to perform better than itertools, but to find a solution that is more specific to my problem. – Nested Mar 31 '22 at 18:26
  • 1
    @JohnColeman I think that also counts all orderings of the partitions. I think you'll have to divide that by k! again for a total of 15,400 in this case. – beaker Mar 31 '22 at 18:29
  • @beaker You are correct – John Coleman Mar 31 '22 at 18:30
  • 2
    The function `get_partitions_same_size` from my answer [here](https://stackoverflow.com/questions/71218611/find-all-unordered-partitions-of-a-list-with-a-minimum-number-of-elements/71233372#71233372) could be helpful. – hilberts_drinking_problem Mar 31 '22 at 19:47
  • I timed three solutions on my computer: @fsimonjetz ~60 seconds, my first ~1.7 seconds, my second ~0.36 seconds. – md2perpe Apr 01 '22 at 11:29

5 Answers5

2

There will certainly be more sophisticated/efficient ways of doing this, but here's an approach that works in a reasonable amount of time for your example and should be easy enough to adapt for other cases.

It generates unique teams and unique combinations thereof, as per your specifications.

from itertools import combinations

# this assumes that team_size * team_num == len(players) is a given
team_size = 3
team_num = 4
players = list('ABCDEFGHIJKL')
unique_teams = [set(c) for c in combinations(players, team_size)]

def duplicate_player(combo):
    """Returns True if a player occurs in more than one team"""
    return len(set.union(*combo)) < len(players)
    
result = (combo for combo in combinations(unique_teams, team_num) if not duplicate_player(combo))

result is a generator that can be iterated or turned into a list with list(result). On kaggle.com, it takes a minute or so to generate the whole list of all possible combinations (a total of 15400, in line with the computations by @beaker and @John Coleman in the comments). The teams are tuples of sets that look like this:

[({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}, {'J', 'K', 'L'}),
 ({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'J'}, {'I', 'K', 'L'}),
 ({'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'K'}, {'I', 'J', 'L'}),
 ...
]

If you want, you can cast them into strings by calling ''.join() on each of them.

fsimonjetz
  • 5,644
  • 3
  • 5
  • 21
1

Another solution (players are numbered 0, 1, ...):

import itertools

def equipartitions(base_count: int, group_size: int):
    if base_count % group_size != 0:
        raise ValueError("group_count must divide base_count")

    return set(_equipartitions(frozenset(range(base_count)), group_size))


def _equipartitions(base_set: frozenset, group_size: int):
    if not base_set:
        yield frozenset()

    for combo in itertools.combinations(base_set, group_size):
        for rest in _equipartitions(base_set.difference(frozenset(combo)), group_size):
            yield frozenset({frozenset(combo), *rest})


all_combinations = [
    [tuple(team) for team in combo]
        for combo in equipartitions(12, 3)
]

print(all_combinations)
print(len(all_combinations))

And another:

import itertools
from typing import Iterable

def equipartitions(players: Iterable, team_size: int):
    if len(players) % team_size != 0:
        raise ValueError("group_count must divide base_count")

    return _equipartitions(set(players), team_size)


def _equipartitions(players: set, team_size: int):
    if not players:
        yield []
        return

    first_player, *other_players = players
    for other_team_members in itertools.combinations(other_players, team_size-1):
        first_team = {first_player, *other_team_members}
        for other_teams in _equipartitions(set(other_players) - set(first_team), team_size):
            yield [first_team, *other_teams]


all_combinations = [
    {''.join(sorted(team)) for team in combo} for combo in equipartitions(players='ABCDEFGHIJKL', team_size=3)
]


print(all_combinations)
print(len(all_combinations))
md2perpe
  • 3,372
  • 2
  • 18
  • 22
  • I guess that's one of the "more sophisticated/efficient ways of doing this" :) very insightful! – fsimonjetz Apr 01 '22 at 12:28
  • 1
    @fsimonjetz. I think that your code was clever. But it was slower. On the other hand, if this is something that will just be run manually a few times then waiting a minute is no problem. – md2perpe Apr 01 '22 at 13:29
0

Firstly, you can use a list comprehension to give you all of the possible combinations (regardless of the duplicates):

comb = [(a,b) for a in letters for b in letters if a != b]

And, afterwards, you can use the sorted function to sort the tuples. After that, to remove the duplicates, you can convert all of the items to a set and then back to a list.

var = [tuple(sorted(sub)) for sub in comb]
var = list(set(var))
António Rebelo
  • 186
  • 1
  • 13
  • This is a bit of a waste to generate combinations to discard them afterwards, and this solution doesn't work out-of-the-box for any `n` (it currently hard-codes `n=2`). – jthulhu Mar 31 '22 at 18:05
  • generating all the possibilities and removing the duplicates was my first idea, but with x = 12 this is 400 millions combos. Way too much for my computer to compiute. According to some comments, my final combo list should be around 15.400 combos (not verified). Should be possible to generate this list without having to generate all 400 millions combos – Nested Mar 31 '22 at 18:29
  • @Nested Yeah fair enough. Hadn't thought of that. I'm afraid I can't help much then – António Rebelo Mar 31 '22 at 18:37
0

You could use the list comprehension approach, which has a time complexity of O(n*n-1), or you could use a more verbose way, but with a slightly better time complexity of O(n^2-n)/2:

comb = []

for first_letter_idx, _ in enumerate(letters):
    for sec_letter_idx in range(first_letter_idx + 1, len(letters)):
        comb.append(letters[first_letter_idx] + letters[sec_letter_idx])

print(comb)

comb2 = []

for first_letter_idx, _ in enumerate(comb):
    for sec_letter_idx in range(first_letter_idx + 1, len(comb)):
        if (comb[first_letter_idx][0] not in comb[sec_letter_idx]
        and comb[first_letter_idx][1] not in comb[sec_letter_idx]):
            comb2.append([comb[first_letter_idx], comb[sec_letter_idx]])

print(comb2)

This algorithm needs more work to handle dynamic inputs. Maybe with recursion.

-1

Use combination from itertools

from itertools import combinations 

x = list(combinations(['A','B','C','D'],2))

t = []
for i in (x):
    t.append(i[0]+i[1]) # concatenating the strings and adding in a list

g = []
for i in range(0,len(t),2):
    for j in range(i+1,len(t)):
        g.append([t[i],t[j]])
        break
print(g)
logi
  • 66
  • 1
  • 1
  • 6
  • this code is generating the 6 combinations of 2 letters. That's not what i'm interested in. The final result should be the 3 combinations i gave in my first post. One combination is 2 group of 2 letters. ['AB', 'CD'] is one combination. – Nested Mar 31 '22 at 19:13
  • Check if this works for you..Since any pair like [AB, AC] in reverse as well as within element letters in reverse like BA, CA is considered as duplicate..the last for loop jumps 2 indexes in iterating the list created by the combinations.. – logi Mar 31 '22 at 19:52