Here's a reasonably efficient algorithm. It's ok on small data sets, but it will take time & RAM for larger data sets. I won't claim it's the best algorithm, but it's certainly much better than brute-force checking each good - bad pairing.
The key insight is that we don't need to look at the individual good - bad pairings, we can look at the total good - bad score of whole teams. Let the scores for two valid teams be (X, A) and (Y, B). That is, X is the total score of all the good players in team 1 and A is the total score of all the bad players in team 1. Similarly, Y is the score of the good players in team 2 and B is the score of the bad players in that team. Then
X - A = Y - B
X + B = Y + A
2(X + B) = X + B + Y + A
X + B = (X + B + Y + A) / 2
Let M = (X + Y + A + B) / 2
Thus B = M - X
So we just need to calculate M, the total of all players' scores, and then for each good player partition X we need to see if there's a matching bad player partition B. (Incidentally, the arithmetic above shows that X + Y + A + B must be even for a solution to exist, since B & X must be whole numbers).
To do that with reasonable efficiency, we create a dict of the bad player partitions. The dict key is the total score for that partition, the associated value is a list of all bad player partitions with that score.
Then we loop over the good player partition pairs. We find the total score for the 1st partition in a pair, subtract it from M to get the required bad score, and see if that bad score's in the dictionary. If it is, we've found a solution.
The function equal_parts
is the fastest way I know in pure Python to split a list into all of its equal length partitions, see this answer I wrote a couple of years ago for details.
The code below finds a single solution for the data given in the question; that solution yields 2 options for each team. I've included some extra code (currently commented out) to generate some random fake data in order to test the code on a larger data set.
Update
There was a minor bug in the logic of the previous version. When I found a list of bad partitions with the correct total score b1
I used bad_total - b1
to get the complementary list of bad partitions for the other team. But that doesn't always work correctly. :oops: The new version uses a bad_parts
dictionary that stores each bad partition both as a key and a value, which makes it easy to get the correct complementary partition.
Also, the old make_pairs
function was a bit sloppy, so it could produce Team 1 lists and Team 2 lists that shared players. :other oops:. We now have a new function make_teams
that calls a repaired make_pairs
. The make_teams
function prints the various Team 1 and Team 2 permutations in sections. Within each section no Team 1 list will share players with a Team 2 list.
I've also made a few other minor changes, the main one being that the names in a partition tuple are always sorted alphabetically. That makes their use as dictionary keys more robust. I've also tweaked the output format a little.
It may sometimes happen (eg in Solution 23 of the data you supplied in a comment) that an alleged Solution doesn't work: there's no permutation of bad players that can be combined with good players without generating a negative score. Hopefully, that's not a major problem. To make the code deal with that automatically we'd need to change it so that instead of printing results as it goes it would need to store the results in lists (or perhaps use generators), and then test that the resulting lists aren't empty before printing that solution. This would consume a lot more RAM, and would make the logic even more cluttered than it currently is.
from itertools import combinations, permutations
from collections import defaultdict
from random import seed, randrange
seed(37)
def equal_parts(lst):
''' yield all equal-sized pair partitions of lst '''
first = lst[0]
allset_diff = set(lst).difference
for left in combinations(lst, len(lst) // 2):
if left[0] != first:
break
yield left, tuple(sorted(allset_diff(left)))
def find_partitions(bad, good):
print('bad', bad, len(bad))
print('good', good, len(good))
bad_total = sum(bad.values())
good_total = sum(good.values())
total = bad_total + good_total
if total % 2 != 0:
print('No solutions!')
return
magic = total // 2
print('magic =', magic)
# Create a dict of the good partition pairs
good_parts = dict(equal_parts(sorted(good)))
# Create a dict of the bad partition pairs, with each partition of
# the pair as a key and the complementary partiton as the value
bad_parts = {}
for bpart1, bpart2 in equal_parts(sorted(bad)):
bad_parts[bpart1] = bpart2
bad_parts[bpart2] = bpart1
#print(bad_parts)
# Get the sums of all the bad partitions, and save them in
# a dict of lists with the partition sum as the key and
# the partition in the value list
bad_sums = defaultdict(list)
for bpart in bad_parts:
s = sum(bad[k] for k in bpart)
bad_sums[s].append(bpart)
bad_sums = dict(bad_sums)
#print(bad_sums)
# Sum the 1st of each pair of good partitions & see if there's a matching bad partition
count = 0
for gpart1, gpart2 in good_parts.items():
g1 = sum(good[k] for k in gpart1)
b1 = magic - g1
if b1 in bad_sums:
count += 1
print('SOLUTION', count)
g2 = good_total - g1
b2 = bad_total - b1
blist1 = bad_sums[b1]
# Create the complementary list of bad partitions
blist2 = [bad_parts[k] for k in blist1]
tot1 = g1 - b2
tot2 = g2 - b1
print(gpart1, g1, '-', blist2, b2, '=', tot1)
print(gpart2, g2, '-', blist1, b1, '=', tot2, '\n')
make_teams(gpart1, gpart2, blist1, blist2)
def make_pairs(gpart, bpart):
for b in permutations(bpart):
total = 0
team = []
for gkey, bkey in zip(gpart, b):
score = good[gkey] - bad[bkey]
if score <= 0:
# Invalid pairing
break
total += score
team.append('{}-{}={}'.format(gkey, bkey, score))
else:
print(', '.join(team), total)
def make_teams(gpart1, gpart2, blist1, blist2):
section = 0
for bpart2, bpart1 in zip(blist2, blist1):
section += 1
print('Section', section)
print('Team 1:', ' '.join(gpart1), '+', ' '.join(bpart2))
make_pairs(gpart1, bpart2)
print('\nTeam 2:', ' '.join(gpart2), '+', ' '.join(bpart1))
make_pairs(gpart2, bpart1)
print()
# Make some fake data
def make_data(letters, lo, hi):
return {s: randrange(lo, hi) for s in letters}
#while True:
#bad = make_data('ZYXWVU', 1, 15)
#good = make_data('ABCDEF', 10, 30)
#bad_total = sum(bad.values())
#good_total = sum(good.values())
#if bad_total % 2 == good_total % 2:
#break
bad = {'A': 13, 'B': 6, 'C': 9, 'D': 2,}
good = {'X': 25, 'Y': 16, 'Z': 17, 'K': 10,}
#bad = {'bA': 7, 'bB': 10, 'bC': 2, 'bD': 12, 'bE': 15, 'bF': 14, 'bG': 17, 'bH': 15}
#good = {'gA': 19, 'gB': 36, 'gC': 9, 'gD': 15, 'gE': 24, 'gF': 23, 'gG': 24, 'gH': 24}
find_partitions(bad, good)
output
bad {'A': 13, 'B': 6, 'C': 9, 'D': 2} 4
good {'X': 25, 'Y': 16, 'Z': 17, 'K': 10} 4
magic = 49
SOLUTION 1
('K', 'Z') 27 - [('B', 'D')] 8 = 19
('X', 'Y') 41 - [('A', 'C')] 22 = 19
Section 1
Team 1: K Z + B D
K-B=4, Z-D=15 19
K-D=8, Z-B=11 19
Team 2: X Y + A C
X-A=12, Y-C=7 19
X-C=16, Y-A=3 19
Here's the output from the fake data.
bad {'Z': 2, 'Y': 9, 'X': 5, 'W': 6, 'V': 11, 'U': 12} 6
good {'A': 23, 'B': 28, 'C': 28, 'D': 21, 'E': 28, 'F': 11} 6
magic = 92
SOLUTION 1
('A', 'B', 'C') 79 - [('U', 'V', 'Y')] 32 = 47
('D', 'E', 'F') 60 - [('W', 'X', 'Z')] 13 = 47
Section 1
Team 1: A B C + U V Y
A-U=11, B-V=17, C-Y=19 47
A-U=11, B-Y=19, C-V=17 47
A-V=12, B-U=16, C-Y=19 47
A-V=12, B-Y=19, C-U=16 47
A-Y=14, B-U=16, C-V=17 47
A-Y=14, B-V=17, C-U=16 47
Team 2: D E F + W X Z
D-W=15, E-X=23, F-Z=9 47
D-W=15, E-Z=26, F-X=6 47
D-X=16, E-W=22, F-Z=9 47
D-X=16, E-Z=26, F-W=5 47
D-Z=19, E-W=22, F-X=6 47
D-Z=19, E-X=23, F-W=5 47
SOLUTION 2
('A', 'B', 'D') 72 - [('U', 'V', 'Z'), ('V', 'X', 'Y')] 25 = 47
('C', 'E', 'F') 67 - [('W', 'X', 'Y'), ('U', 'W', 'Z')] 20 = 47
Section 1
Team 1: A B D + U V Z
A-U=11, B-V=17, D-Z=19 47
A-U=11, B-Z=26, D-V=10 47
A-V=12, B-U=16, D-Z=19 47
A-V=12, B-Z=26, D-U=9 47
A-Z=21, B-U=16, D-V=10 47
A-Z=21, B-V=17, D-U=9 47
Team 2: C E F + W X Y
C-W=22, E-X=23, F-Y=2 47
C-W=22, E-Y=19, F-X=6 47
C-X=23, E-W=22, F-Y=2 47
C-X=23, E-Y=19, F-W=5 47
C-Y=19, E-W=22, F-X=6 47
C-Y=19, E-X=23, F-W=5 47
Section 2
Team 1: A B D + V X Y
A-V=12, B-X=23, D-Y=12 47
A-V=12, B-Y=19, D-X=16 47
A-X=18, B-V=17, D-Y=12 47
A-X=18, B-Y=19, D-V=10 47
A-Y=14, B-V=17, D-X=16 47
A-Y=14, B-X=23, D-V=10 47
Team 2: C E F + U W Z
C-U=16, E-W=22, F-Z=9 47
C-U=16, E-Z=26, F-W=5 47
C-W=22, E-U=16, F-Z=9 47
C-Z=26, E-U=16, F-W=5 47
SOLUTION 3
('A', 'B', 'E') 79 - [('U', 'V', 'Y')] 32 = 47
('C', 'D', 'F') 60 - [('W', 'X', 'Z')] 13 = 47
Section 1
Team 1: A B E + U V Y
A-U=11, B-V=17, E-Y=19 47
A-U=11, B-Y=19, E-V=17 47
A-V=12, B-U=16, E-Y=19 47
A-V=12, B-Y=19, E-U=16 47
A-Y=14, B-U=16, E-V=17 47
A-Y=14, B-V=17, E-U=16 47
Team 2: C D F + W X Z
C-W=22, D-X=16, F-Z=9 47
C-W=22, D-Z=19, F-X=6 47
C-X=23, D-W=15, F-Z=9 47
C-X=23, D-Z=19, F-W=5 47
C-Z=26, D-W=15, F-X=6 47
C-Z=26, D-X=16, F-W=5 47
SOLUTION 4
('A', 'C', 'D') 72 - [('U', 'V', 'Z'), ('V', 'X', 'Y')] 25 = 47
('B', 'E', 'F') 67 - [('W', 'X', 'Y'), ('U', 'W', 'Z')] 20 = 47
Section 1
Team 1: A C D + U V Z
A-U=11, C-V=17, D-Z=19 47
A-U=11, C-Z=26, D-V=10 47
A-V=12, C-U=16, D-Z=19 47
A-V=12, C-Z=26, D-U=9 47
A-Z=21, C-U=16, D-V=10 47
A-Z=21, C-V=17, D-U=9 47
Team 2: B E F + W X Y
B-W=22, E-X=23, F-Y=2 47
B-W=22, E-Y=19, F-X=6 47
B-X=23, E-W=22, F-Y=2 47
B-X=23, E-Y=19, F-W=5 47
B-Y=19, E-W=22, F-X=6 47
B-Y=19, E-X=23, F-W=5 47
Section 2
Team 1: A C D + V X Y
A-V=12, C-X=23, D-Y=12 47
A-V=12, C-Y=19, D-X=16 47
A-X=18, C-V=17, D-Y=12 47
A-X=18, C-Y=19, D-V=10 47
A-Y=14, C-V=17, D-X=16 47
A-Y=14, C-X=23, D-V=10 47
Team 2: B E F + U W Z
B-U=16, E-W=22, F-Z=9 47
B-U=16, E-Z=26, F-W=5 47
B-W=22, E-U=16, F-Z=9 47
B-Z=26, E-U=16, F-W=5 47
SOLUTION 5
('A', 'C', 'E') 79 - [('U', 'V', 'Y')] 32 = 47
('B', 'D', 'F') 60 - [('W', 'X', 'Z')] 13 = 47
Section 1
Team 1: A C E + U V Y
A-U=11, C-V=17, E-Y=19 47
A-U=11, C-Y=19, E-V=17 47
A-V=12, C-U=16, E-Y=19 47
A-V=12, C-Y=19, E-U=16 47
A-Y=14, C-U=16, E-V=17 47
A-Y=14, C-V=17, E-U=16 47
Team 2: B D F + W X Z
B-W=22, D-X=16, F-Z=9 47
B-W=22, D-Z=19, F-X=6 47
B-X=23, D-W=15, F-Z=9 47
B-X=23, D-Z=19, F-W=5 47
B-Z=26, D-W=15, F-X=6 47
B-Z=26, D-X=16, F-W=5 47
SOLUTION 6
('A', 'D', 'E') 72 - [('U', 'V', 'Z'), ('V', 'X', 'Y')] 25 = 47
('B', 'C', 'F') 67 - [('W', 'X', 'Y'), ('U', 'W', 'Z')] 20 = 47
Section 1
Team 1: A D E + U V Z
A-U=11, D-V=10, E-Z=26 47
A-U=11, D-Z=19, E-V=17 47
A-V=12, D-U=9, E-Z=26 47
A-V=12, D-Z=19, E-U=16 47
A-Z=21, D-U=9, E-V=17 47
A-Z=21, D-V=10, E-U=16 47
Team 2: B C F + W X Y
B-W=22, C-X=23, F-Y=2 47
B-W=22, C-Y=19, F-X=6 47
B-X=23, C-W=22, F-Y=2 47
B-X=23, C-Y=19, F-W=5 47
B-Y=19, C-W=22, F-X=6 47
B-Y=19, C-X=23, F-W=5 47
Section 2
Team 1: A D E + V X Y
A-V=12, D-X=16, E-Y=19 47
A-V=12, D-Y=12, E-X=23 47
A-X=18, D-V=10, E-Y=19 47
A-X=18, D-Y=12, E-V=17 47
A-Y=14, D-V=10, E-X=23 47
A-Y=14, D-X=16, E-V=17 47
Team 2: B C F + U W Z
B-U=16, C-W=22, F-Z=9 47
B-U=16, C-Z=26, F-W=5 47
B-W=22, C-U=16, F-Z=9 47
B-Z=26, C-U=16, F-W=5 47
PS. I've thought of a way to handle much larger data sets quickly. It won't find all solutions, but it should find plenty of them. The idea is to divide the large data set into several smaller sets, solve those smaller sets independently, and then combine the solutions. The only tricky part is to do the division so that each smaller set is solvable. So in each set bad_total < good_total
and bad_total%2 == good_total%2
. It probably makes sense to sort the full data sets by score before attempting the division so that the bad & good groups of each smaller set are roughly matched.