I have this loop I've written to find all the combinations of some set of data that I have. When iteration_depth = 3, it takes ~13 minutes to run. My laptop has 2 cores though, so I want to speed it up with multiprocessing, but since I don't really know what I'm doing, the syntax/arguments are tripping me up.

import csv
import itertools as it
import multiprocessing
import os

def FindAllAffordableLineups():
    all_rosters = []
    roster = [None]*8

    for n in range(int(iteration_depth)):
        for catcher in catcher_pool[0:n]:
            roster = [None]*8
            roster[0] = catcher[0]
            for first_baseman in first_pool[0:n]:
                roster[1] = first_baseman[0]
                for second_baseman in second_pool[0:n]:
                    roster[2] = second_baseman[0]
                    for third_baseman in third_pool[0:n]:
                        roster[3] = third_baseman[0]
                        for shortstop in short_pool[0:n]:
                            roster[4] = shortstop[0]
                            for outfielders in it.combinations(of_pool, 3):
                                roster[5:8] = outfielders[0][0], outfielders[1][0], outfielders[2][0]
                                salaryList = []
                                for player1 in roster:
                                    for player2 in player_pool:
                                        if player1 == player2[0]:
                                            salaryList.append(int(player2[3]))
                                if sum(salaryList) <= remaining_salary:
                                    if len(roster) == len(set(roster)):
                                        all_rosters.append(roster[:])
                                        if len(all_rosters) < 50:
                                            print('Number of possible rosters found: ',len(all_rosters))
                                        if len(all_rosters) == 50:
                                            print("Fifty affordable rosters were found. We're not displaying every time we find another one. That would slow us down a lot.")          
                                        salaryList = []
                                if len(all_rosters) > 10**6:
                                    writeRosters = open(os.path.join('Affordable Rosters.csv'), 'w', newline = '')
                                    csvWriter = csv.writer(writeRosters)
                                    for row in all_rosters:
                                        csvWriter.writerow(row)
                                    writeRosters.close()
                                    all_rosters = []
        writeRosters = open(os.path.join('Affordable Rosters.csv'), 'w', newline = '')
        csvWriter = csv.writer(writeRosters)
        for row in all_rosters:
            csvWriter.writerow(row)
        writeRosters.close()

pool = multiprocessing.Pool(processes=2)
r = pool.map(FindAllAffordableLineups())

This gives me

Traceback (most recent call last):
  File "C:\Users\Owner\Desktop\Multiprocessing\11 - Find Optimal Lineup.py", line 133, in <module>
    r = pool.map(FindAllAffordableLineups())
TypeError: map() missing 1 required positional argument: 'iterable'

In most of the examples I've looked at, the function being mapped takes an argument, and the values for that argument are the iterable passed to pool.map. My function doesn't take any arguments, though. How do I fix this?

jbf
  • This is your second question related to this project you're working on, which is fine, but I don't think you've really explained _what you're trying to do_ well. I have a feeling that a good solution to your problem is simpler than you're making it out to be, but we need a good, clear explanation. I realize this doesn't directly answer this question; I'm just trying to help with the _real_ problem. – Cyphase Aug 19 '15 at 05:32
  • r = pool.map(FindAllAffordableLineups()) should have the form r = pool.map(function, iterable); see https://docs.python.org/2/library/multiprocessing.html – dsgdfg Aug 19 '15 at 05:32
  • @Cyphase it's more for me to learn Python than to actually accomplish any end. The questions are roadblocks I run into along the way. No doubt there are better ways to do these things. – jbf Aug 19 '15 at 06:02
  • Your structure is quite iterative, would you consider changing it rather than going for multiprocessing? Or are you trying to learn multiprocessing? – zom-pro Aug 19 '15 at 06:46
  • Whatever I do, it's going to have to cycle through a lot of combinations, so I'm going to want to utilize multiprocessing (also I just want to know how to use it). But I'm open to changing the structure. I posted a question a few days ago about the optimization of that big nested loop. I can edit this one with relevant info if you want. I don't know if that would make this too off topic though. – jbf Aug 19 '15 at 12:57

1 Answer

Your code is essentially finding the Cartesian product of these iterables:

[catcher_pool, first_pool, second_pool, third_pool, short_pool, it.combinations(of_pool, 3)]

Refer to the post "Get the cartesian product of a series of lists?" to see how that can be done nicely.

Then you can write a helper function to filter out invalid lineups.
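As a rough sketch of that idea, the nested loops could collapse into a single itertools.product call plus a filter helper. The names is_valid_lineup and affordable_lineups are made up for illustration, and the code assumes every pool row has the same layout as player_pool in the question (name at index 0, salary at index 3), so the salary can be read from the row instead of being looked up by name.

```python
import itertools as it


def is_valid_lineup(players, remaining_salary):
    """Keep a lineup only if it is affordable and no player appears twice.

    Assumption: each player row has the name at index 0 and salary at index 3.
    """
    names = [p[0] for p in players]
    total_salary = sum(int(p[3]) for p in players)
    return total_salary <= remaining_salary and len(names) == len(set(names))


def affordable_lineups(infield_pools, of_pool, remaining_salary):
    """Yield rosters (lists of 8 names) that pass is_valid_lineup.

    infield_pools is a list of five pools: catcher, first, second, third, short.
    """
    outfield_trios = it.combinations(of_pool, 3)
    # product() pairs every infield choice with every outfield trio.
    for *infield, outfield in it.product(*infield_pools, outfield_trios):
        players = infield + list(outfield)
        if is_valid_lineup(players, remaining_salary):
            yield [p[0] for p in players]
```

Because affordable_lineups is a generator, the rosters can be written to the CSV file as they are produced rather than collected in a big list first.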

This should get you a reasonable speedup. If you still want to parallelize, I would take advantage of the tree-like structure of your code: pick the first few players sequentially (giving you a list of tuples of the first few players), then map those tuples, in parallel, through the rest of your function.
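Here is one way the parallel version might look, as a sketch rather than a drop-in replacement. The function names lineups_for_prefix and find_lineups_parallel and the sample rows are hypothetical, and the row layout (name at index 0, salary at index 3) is assumed to match player_pool from the question. It also shows what the TypeError was about: pool.map wants the function object itself plus an iterable of arguments, not the result of calling the function.

```python
import itertools as it
import multiprocessing


def lineups_for_prefix(task):
    """Worker: complete every lineup that starts with one (catcher, first baseman) pair."""
    prefix, rest_pools, of_pool, remaining_salary = task
    found = []
    for *infield, outfield in it.product(*rest_pools, it.combinations(of_pool, 3)):
        players = list(prefix) + infield + list(outfield)
        names = [p[0] for p in players]
        affordable = sum(int(p[3]) for p in players) <= remaining_salary
        if affordable and len(names) == len(set(names)):
            found.append(names)
    return found


def find_lineups_parallel(infield_pools, of_pool, remaining_salary, processes=2):
    """Pick the first two positions sequentially, then fan the rest out to a pool."""
    prefixes = it.product(infield_pools[0], infield_pools[1])
    tasks = [(prefix, infield_pools[2:], of_pool, remaining_salary)
             for prefix in prefixes]
    with multiprocessing.Pool(processes=processes) as pool:
        # Pass the function itself (no parentheses) and the iterable of tasks.
        per_task = pool.map(lineups_for_prefix, tasks)
    # Flatten the per-task lists into one list of rosters.
    return [roster for rosters in per_task for roster in rosters]


if __name__ == "__main__":
    # Hypothetical sample rows: (name, team, position, salary).
    infield_pools = [
        [("C1", "", "C", "10"), ("C2", "", "C", "20")],
        [("1B1", "", "1B", "10")],
        [("2B1", "", "2B", "10")],
        [("3B1", "", "3B", "10")],
        [("SS1", "", "SS", "10"), ("SS2", "", "SS", "30")],
    ]
    of_pool = [("OF1", "", "OF", "10"), ("OF2", "", "OF", "10"),
               ("OF3", "", "OF", "10"), ("OF4", "", "OF", "20")]
    rosters = find_lineups_parallel(infield_pools, of_pool, remaining_salary=90)
    print("Affordable rosters found:", len(rosters))
```

The if __name__ == "__main__": guard matters on Windows, where each worker process re-imports the script; without it, every worker would try to start a pool of its own. Each task here covers one (catcher, first baseman) pair, so with small pools it may be worth fixing three positions in the prefix to give the two processes more evenly sized chunks.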

Community