0

I have a python script where I use itertools.combinations to select a number of combinations within a fixed range.

To give an example, I have a csv file that contains a list of golfers and they each have a ranking between 1 and 3

Seamus Power,10500,1
Brian Harman,10300,2
Tom Hoge,9800,1
Taylor Montgomery,9600,1
Jason Day,9400,1
Keith Mitchell,9300,2
Joel Dahmen,9200,1
Denny McCarthy,9100,1
Matthew NeSmith,9000,1
Matt Kuchar,8900,1
Mackenzie Hughes,8600,1
Davis Riley,8100,3
Brendon Todd,8000,3
Andrew Putnam,7900,1
J.J. Spaun,7800,3
Aaron Rai,7800,1
Harris English,7700,2
Will Gordon,7700,1
Greyson Sigg,7500,1
J.T. Poston,7500,1
Seonghyeon Kim,7500,3
Troy Merritt,7400,1
Sepp Straka,7200,1
Justin Suh,7200,2
Adam Long,7100,1
Kevin Streelman,7000,2
Ben Taylor,7000,3
John Huh,6900,2
Austin Cook,6800,2
David Lingmerth,6700,3

In my code I then place each golfer into its own list based on its ranking, and then with itertools.combinations I ask for a set number of golfers per tier. As I want 6 golfers (and it has to be 6 golfers to create my lineup), in the example below I ask for combinations of 2 golfers from tier 1 and combinations of 4 golfers from tier 2 and build lineups from them like so:

    import csv
    import itertools
    import random

    tier1Range = []
    tier2Range = []
    tier3Range = []
    
    with open('combinations.csv', 'r') as file:
        reader = csv.reader(file)
        for row in reader:
            if int(row[2]) == 1:
                tier1Range.append(row)
            elif int(row[2]) == 2:
                tier2Range.append(row)
            elif int(row[2]) == 3:
                tier3Range.append(row)         
    
    lineup = []
    finalLineup = []
    finalRandomLineup = []
    
    # Edit below to fit criteria of which ranges and how many per range
    sel_tier1 = itertools.combinations(tier1Range,2) 
    sel_tier2 = itertools.combinations(tier2Range,4)
    
    lineup = [p + q for p,q in itertools.product(sel_tier1,sel_tier2)]

    # Keep only the lineups whose combined salary is within the threshold
    finalLineup = [x for x in lineup if 49700 <= sum(int(golfer[1]) for golfer in x) <= 50000]

    finalRandomLineup += random.sample(finalLineup, 20)

    with open('temp.csv', 'w') as f:
        for line in finalRandomLineup:
            totalSalary = sum(int(golfer[1]) for golfer in line)
            f.write(','.join(golfer[0] for golfer in line) + ' ^ total salary ' + str(totalSalary) + '\n\n')

However, I want my combinations to be more dynamic than that, and this is my question. I want the ability to state that I want 2-3 golfers from tier 1 and 3-4 golfers from tier 2, as an example.

So rather than a fixed range of 2 from tier 1 and 4 from tier 2, I want the ability to pick combinations of 2 or 3 golfers from tier 1 and combinations of 3 or 4 golfers from tier 2.

At the end when the lineup is created, it has to be a max of 6 golfers. So I can't have 3 golfers from tier 1 combined with 4 golfers from tier 2 as it goes over the threshold.

Does anybody know the best way to achieve this?

EDIT

I updated the code above to include salary: the lineups it generates have to fall within the salary threshold of 49700 to 50000, based on the second column from the csv (which has been updated). So each row of the csv goes: golfer name, salary, rank.

So above it picks 2 golfers from tier 1 and 4 from tier 2, and then when it creates my lineups, it only keeps those where the combined salary of all 6 golfers is between 49700 and 50000.

BruceyBandit
  • Flipping a coin (`random.randrange(2) == 0`) to choose between a 2+4 setup or 3+3 setup and then computing accordingly would satisfy what you've said. – Kache Jun 22 '23 at 22:58
  • what do you mean by dynamic? – darth baba Jun 22 '23 at 23:02
  • @darthbaba May be wrong word but I meant not a fixed single value. So want the possibility to say create either this number of combinations or that number of combinations for a tier, rather than a single number – BruceyBandit Jun 22 '23 at 23:07
  • @Kache Are you ok sharing a sample code with your suggestion so I can visually see how you would implement that? – BruceyBandit Jun 22 '23 at 23:08
  • You want to randomly create these combinations? or you want to input your combinations? – darth baba Jun 22 '23 at 23:08
  • @darthbaba Randomly. I don't have it in the snippet above but I do `finalRandomLineup += ((random.sample(lineup,20))` currently. When I change the number 20 to whichever number I want to say this many lineups to generate – BruceyBandit Jun 22 '23 at 23:11

2 Answers

1

Without looking too deeply into your code and assuming it already does what you want in the 2+4 case, an example of what I'm suggesting in my comment is:

if random.randrange(2) == 0:  # 50/50 chance
    tier1_size, tier2_size = 2, 4
else:
    tier1_size, tier2_size = 3, 3

sel_tier1 = itertools.combinations(tier1Range, tier1_size) 
sel_tier2 = itertools.combinations(tier2Range, tier2_size)

So half of the time it'll create 2+4 lineups, and half of the time it'll create 3+3 lineups.
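
The coin flip extends to any set of allowed sizes. A minimal sketch, assuming ranges of 2-3 golfers for tier 1 and 3-4 for tier 2 with a team size of 6 (the names `tier1_sizes`, `tier2_sizes` and `profiles` are illustrative, not from the question): list the size pairs that add up to exactly six, then pick one at random.

```python
import itertools
import random

TEAM_SIZE = 6
tier1_sizes = range(2, 3 + 1)  # assumed: 2 or 3 golfers from tier 1
tier2_sizes = range(3, 4 + 1)  # assumed: 3 or 4 golfers from tier 2

# keep only the size pairs that fill the lineup exactly
profiles = [
    (n1, n2)
    for n1, n2 in itertools.product(tier1_sizes, tier2_sizes)
    if n1 + n2 == TEAM_SIZE
]
print(profiles)  # [(2, 4), (3, 3)]

# one profile chosen at random, to be passed to itertools.combinations
tier1_size, tier2_size = random.choice(profiles)
```

Note that `random.choice` treats the profiles as equally likely, even though a 3+3 profile may contain more possible lineups than a 2+4 one.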


An update, based on your additional descriptions. In this simplified example, a "team profile" is defined only by tier1_size because the team size is fixed at 6.

lineups_for() generates all possible lineups given some profile, and all_lineups is the combined lineups of all profiles requested via tier1_sizes_to_try.

This is quick to write but not particularly efficient because it generates many combinations that are ultimately thrown away. A more efficient method would sample iteratively, similar to reservoir sampling; see, for example, the question "Algorithm to select a single, random combination of values?"

import itertools as it
import random

# using integers as placeholders
tier1Range = [0, 2, 4, 6, 8, 10, 12]
tier2Range = [1, 3, 5, 7, 9, 11, 13]

TEAMSIZE = 6
tier1_sizes_to_try = [2, 3]  # take this as input


def lineups_for(tier1_size):
    tier2_size = TEAMSIZE - tier1_size
    sel_tier1 = it.combinations(tier1Range, tier1_size)
    sel_tier2 = it.combinations(tier2Range, tier2_size)

    return [p + q for p, q in it.product(sel_tier1, sel_tier2)]


all_lineups = [lineup for tier1_size in tier1_sizes_to_try for lineup in lineups_for(tier1_size)]

random.sample(all_lineups, 10)
# For example:
#  [
#    (4, 6, 12, 5, 11, 13),
#    (0, 2, 12, 3, 9, 11),
#    (0, 6, 12, 1, 3, 5),
#    (2, 6, 10, 1, 5, 11),
#    (8, 12, 5, 7, 9, 13),
#    (6, 10, 12, 1, 3, 9),
#    (0, 6, 12, 5, 7, 9),
#    (4, 6, 1, 3, 9, 13),
#    (0, 2, 4, 3, 9, 13),
#    (0, 10, 7, 9, 11, 13),
#  ]
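
The "sample iteratively" remark above could look roughly like the following. This is a sketch under assumptions, not code taken from the linked question: it picks a profile with probability proportional to the number of lineups that profile contains (via `math.comb`), then draws the golfers for each tier with `random.sample`, so the full list of lineups is never built. `random_lineup` and `random_lineups` are hypothetical helper names, and the integer placeholders are the same as in the example above.

```python
import math
import random

# using integers as placeholders, as in the example above
tier1Range = [0, 2, 4, 6, 8, 10, 12]
tier2Range = [1, 3, 5, 7, 9, 11, 13]

TEAMSIZE = 6
tier1_sizes_to_try = [2, 3]


def random_lineup():
    # weight each profile by how many lineups it can produce,
    # so that every individual lineup is equally likely
    weights = [
        math.comb(len(tier1Range), n1) * math.comb(len(tier2Range), TEAMSIZE - n1)
        for n1 in tier1_sizes_to_try
    ]
    tier1_size = random.choices(tier1_sizes_to_try, weights=weights)[0]
    tier2_size = TEAMSIZE - tier1_size
    # sort so the same set of golfers always yields the same tuple
    picked1 = sorted(random.sample(tier1Range, tier1_size))
    picked2 = sorted(random.sample(tier2Range, tier2_size))
    return tuple(picked1 + picked2)


def random_lineups(count, max_attempts=100_000):
    # collect distinct lineups; stop early after max_attempts draws
    seen = set()
    for _ in range(max_attempts):
        if len(seen) == count:
            break
        seen.add(random_lineup())
    return list(seen)


lineups = random_lineups(10)
```

A salary check would slot in as an extra rejection step before `seen.add`, at the cost of more attempts when few lineups fall inside the salary range.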
Kache
  • This method would only generate lineups from one set of combination, i.e. if one is 2+4, so too will all others. (Responding to (now-deleted) comment from OP) – Mous Jun 22 '23 at 23:14
  • Since you didn't really specify, my code will choose only one lineup profile and then generate lineups for that profile. Are you having trouble with generating two lineups, one for each profile, combining them, then sampling from the combined collection? – Kache Jun 22 '23 at 23:15
  • @Kache It's tricky to explain but I will do my best. I added some additional code to my question code block so you can see. Currently in that code block I can create 20 random lineups which consist of golfers where 2 are from tier 1 and 4 from tier 2. But I may also want to include 3 from tier 1 and 3 from tier 2 and include them in my final random lineup. So it could be 10 lineups containing 2 and 4, and 10 from 3 and 3...or 6 from 2 and 4 and 14 from 3 and 3. It doesn't matter on how many there are, it's having the capability of at least generating lineups from these 2 possible combinations – BruceyBandit Jun 22 '23 at 23:21
  • @Kache that's why Ideally I would like a range and then the script is smart enough to know how to generate each lineup of 6 golfers from different ranges. It could be give me 2 or 3 from tier 1, 3 or 4 from tier 2, 1 or 0 from tier 3 etc. And then the script is like ok here are all the possible combinations based on the criteria with max 6 golfers. So lineups will include 2, 4, 0... 3, 3, 0 ... 2, 3, 1 etc – BruceyBandit Jun 22 '23 at 23:24
  • I'm writing a script to do that right now. – Mous Jun 22 '23 at 23:25
  • I still don't understand if the final list has only 6 golfers then there are only 15 combinations possible of 2 golfers each. Are you selecting 6 golfers and then taking a combination? – darth baba Jun 22 '23 at 23:26
  • @darthbaba Well actually I cut down the code. In reality the CSV file has more golfers and include a salary. It's for DFS golf where a lineup must contain 6 golfers and the budget is $50k. So each golfer is priced. I will update the code so you can see what it looks like fully. I also set up my final lineups so the cost of each lineup is between 49700 and 50000 – BruceyBandit Jun 22 '23 at 23:30
  • @Mous Thank you – BruceyBandit Jun 22 '23 at 23:30
  • Do all the 6 golfer play with each other? then we have 15 combinations possible, how are you selecting 20? – darth baba Jun 22 '23 at 23:37
  • Added an update to cover what I understand from what you're saying. – Kache Jun 22 '23 at 23:41
  • @darthbaba Updated code and CSV file which is much bigger. If I enter in a number that's too high (like 20 when there's 15), my script would bomb out and say I have requested more than there is possible, and then I have to decrease until it is able to write to file due to the number being on the dot or below the max combinations possible – BruceyBandit Jun 22 '23 at 23:42
  • @Kache Just giving it a read and attempting to test it on mine :) – BruceyBandit Jun 22 '23 at 23:47
  • @Mous Would like to compare with your script also once ready to see which one would work best – BruceyBandit Jun 22 '23 at 23:50
  • What is the problem with this answer, I think it solves your problem – darth baba Jun 22 '23 at 23:56
  • @BruceyBandit my script is included below. – Mous Jun 23 '23 at 00:16
1

I would probably write your code like this.

import itertools
import random
import csv



tier_ranges = {1: range(2, 3 + 1), 2: range(3, 4 + 1)}

max_in_lineup = 6

n_lineups = 20

salary_range = range(49700, 50000+1)



tiers = {1: [], 2: [], 3: []}

golfers = {}

#  load in tiers
with open('combinations.csv') as file:
    reader = csv.reader(file)
    for row in reader:
        tiers[int(row[2])].append(row[0]) # tier
        golfers[row[0]]=int(row[1]) # cost

# generate every combination of per-tier golfer counts, e.g. (2, 3), (2, 4), (3, 3), (3, 4)
possible_combinations = itertools.product(*tier_ranges.values())

# drop the count combinations that total more than six golfers
filtered_combinations = [i for i in possible_combinations if sum(i) <= max_in_lineup]

# for each remaining count combination, generate all the names as a 2d list
# (pair each count with its tier from tier_ranges, not from the longer tiers dict)
all_combs = [list(itertools.product(*[itertools.combinations(tiers[tier],n) for tier, n in zip(tier_ranges.keys(), comb)])) for comb in filtered_combinations]

#flatten the list
temp_lineups = [b for a in all_combs for b in a]

# flatten the sublists
all_lineups = [[k for j in i for k in j] for i in temp_lineups]

# filter to make sure prices work
final_lineups = [i for i in all_lineups if sum(map(golfers.__getitem__, i)) in salary_range]

# example output
lineups = random.sample(final_lineups, n_lineups)
print('\n'.join(map(', '.join,lineups)))

Output:

Jason Day, Will Gordon, Troy Merritt, Brian Harman, Harris English, Justin Suh
Taylor Montgomery, Jason Day, Will Gordon, Keith Mitchell, Justin Suh, Austin Cook
Seamus Power, Jason Day, Matthew NeSmith, Justin Suh, Kevin Streelman, John Huh
Mackenzie Hughes, J.T. Poston, Troy Merritt, Brian Harman, Keith Mitchell, Austin Cook
Taylor Montgomery, Matt Kuchar, Troy Merritt, Brian Harman, Kevin Streelman, Austin Cook
Matthew NeSmith, Mackenzie Hughes, Brian Harman, Harris English, Justin Suh, Kevin Streelman
Tom Hoge, Will Gordon, Troy Merritt, Brian Harman, Harris English, Kevin Streelman
Seamus Power, Taylor Montgomery, Matt Kuchar, Justin Suh, Kevin Streelman, Austin Cook
Seamus Power, Mackenzie Hughes, Troy Merritt, Keith Mitchell, Justin Suh, Kevin Streelman
Denny McCarthy, Sepp Straka, Adam Long, Brian Harman, Keith Mitchell, Austin Cook
Seamus Power, Greyson Sigg, Sepp Straka, Brian Harman, Harris English, Austin Cook
Matt Kuchar, Troy Merritt, Brian Harman, Keith Mitchell, Kevin Streelman, Austin Cook
Taylor Montgomery, Matt Kuchar, Aaron Rai, Keith Mitchell, Justin Suh, Kevin Streelman
Seamus Power, Greyson Sigg, J.T. Poston, Brian Harman, Justin Suh, Austin Cook
Joel Dahmen, Denny McCarthy, Troy Merritt, Brian Harman, Justin Suh, Austin Cook
Joel Dahmen, Aaron Rai, Will Gordon, Brian Harman, Harris English, Kevin Streelman
Seamus Power, Denny McCarthy, Sepp Straka, Keith Mitchell, Kevin Streelman, John Huh
Jason Day, Denny McCarthy, Sepp Straka, Brian Harman, Justin Suh, Austin Cook
Joel Dahmen, Troy Merritt, Brian Harman, Keith Mitchell, John Huh, Austin Cook
Matthew NeSmith, Matt Kuchar, Aaron Rai, Brian Harman, Kevin Streelman, Austin Cook

Alternatively, you could write it as a one-liner, like

final_lineups = [i for i in [[k for j in i for k in j] for i in [b for a in [list(itertools.product(*[itertools.combinations(tiers[tier],n) for tier, n in zip(tier_ranges.keys(), comb)])) for comb in [i for i in itertools.product(*tier_ranges.values()) if sum(i) <= max_in_lineup]] for b in a]] if sum(map(golfers.__getitem__, i)) in salary_range]

But I've elected to express it in the slightly longer form above so that the intermediate steps are apparent.
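
Two details may be worth adjusting, depending on the intent. The `sum(i) <= max_in_lineup` filter also lets through count combinations such as (2, 3) that total only five golfers (with the data in the question, the salary floor of 49700 happens to exclude those lineups), and `random.sample` raises `ValueError` when asked for more lineups than exist. A minimal sketch of both adjustments, using a small made-up tier layout rather than the real CSV, and adding the optional tier 3 range mentioned in the comments (all data and the names `lineup_size` and `count_profiles` are illustrative assumptions):

```python
import itertools
import random

# made-up placeholder data: tier -> golfer names
tiers = {
    1: ["A1", "A2", "A3", "A4"],
    2: ["B1", "B2", "B3", "B4", "B5"],
    3: ["C1", "C2"],
}
# allowed number of golfers per tier (tier 3: 0 or 1)
tier_ranges = {1: range(2, 3 + 1), 2: range(3, 4 + 1), 3: range(0, 1 + 1)}
lineup_size = 6
n_lineups = 20

# per-tier counts that add up to exactly lineup_size
count_profiles = [
    counts
    for counts in itertools.product(*tier_ranges.values())
    if sum(counts) == lineup_size
]
print(count_profiles)  # [(2, 3, 1), (2, 4, 0), (3, 3, 0)]

# every lineup for every profile, flattened to a list of names
all_lineups = [
    [name for group in groups for name in group]
    for counts in count_profiles
    for groups in itertools.product(
        *(itertools.combinations(tiers[tier], n) for tier, n in zip(tier_ranges, counts))
    )
]

# never ask random.sample for more lineups than exist
lineups = random.sample(all_lineups, min(n_lineups, len(all_lineups)))
```

The salary filter from the longer form above would be applied to `all_lineups` before sampling, exactly as before.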

Mous