In Python I am using itertools.product to iterate over all possible combinations of a list of characters, which produces a very large result. However, when I look at the Windows 10 Task Manager, the Python process executing this task is only using 13.5% CPU. I looked into multiprocessing in Python and found that with pool.map I can map a function over a pool and have multiple instances of the function running in parallel. This is great, but since I am iterating over a single (very large) list, and this is done in one instance of a function that takes a long time, this doesn't help me.
So, the way I see it, the only way to speed this up is to split the result of itertools.product into groups and iterate over the groups in parallel. If I can get the length of the result of itertools.product, I can divide it into groups according to the number of processor cores I have available, and then use multiprocessing to iterate over all these groups in parallel.
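To make the idea concrete, here is a minimal sketch of the bookkeeping I have in mind. It relies on the fact that itertools.product(chars, repeat=n) yields len(chars) ** n tuples, so the length can be computed without building the list. The helper names product_length and group_bounds are made up for this illustration.

```python
import itertools


def product_length(characters, repeat):
    """Number of tuples that itertools.product(characters, repeat=repeat) yields."""
    return len(characters) ** repeat


def group_bounds(total, groups):
    """Split range(total) into `groups` contiguous (start, stop) index pairs."""
    size, extra = divmod(total, groups)
    bounds = []
    start = 0
    for i in range(groups):
        # The first `extra` groups take one additional item each,
        # so every index is covered exactly once.
        stop = start + size + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


total = product_length("123", 4)
print(total)                   # 81
print(group_bounds(total, 4))  # [(0, 21), (21, 41), (41, 61), (61, 81)]
```

Each (start, stop) pair could then be handed to one worker process.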
So my question is: can this be done, and what is the best approach?
Maybe there is a module out there for this sort of thing?
The concept is something like this (the following actually works, but gives a MemoryError when I try to scale it up to the full character set that is commented out):
#!/usr/bin/env python3.5
import sys, itertools, multiprocessing, functools

def process_group(iIterationNumber, iGroupSize, sCharacters, iCombinationLength, iCombintationsListLength, iTotalIterations):
    iStartIndex = 0
    if iIterationNumber > 1: iStartIndex = (iIterationNumber - 1) * iGroupSize
    iStopIndex = iGroupSize * iIterationNumber
    if iIterationNumber == iTotalIterations: iStopIndex = iCombintationsListLength
    aCombinations = itertools.product(sCharacters, repeat=iCombinationLength)
    lstCombinations = list(aCombinations)
    print("Iteration#", iIterationNumber, "StartIndex:", iStartIndex, iStopIndex)
    for iIndex in range(iStartIndex, iStopIndex):
        aCombination = lstCombinations[iIndex]
        print("Iteration#", iIterationNumber, ''.join(aCombination))

if __name__ == '__main__':
    #_sCharacters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~`!@#$%^&*()_-+={[}]|\"""':;?/>.<,"
    _sCharacters = "123"
    _iCombinationLength = 4
    aCombinations = itertools.product(_sCharacters, repeat=_iCombinationLength)
    lstCombinations = list(aCombinations)
    _iCombintationsListLength = len(lstCombinations)
    iCPUCores = 4
    _iGroupSize = round(_iCombintationsListLength / iCPUCores)
    print("Length", _iCombintationsListLength)
    pool = multiprocessing.Pool()
    pool.map(
        functools.partial(
            process_group,
            iGroupSize = _iGroupSize,
            sCharacters = _sCharacters,
            iCombinationLength = _iCombinationLength,
            iCombintationsListLength = _iCombintationsListLength,
            iTotalIterations = iCPUCores,
        ),
        range(1, iCPUCores + 1),
    )
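The MemoryError presumably comes from the list(...) calls, which materialize every combination in the parent and again in each worker. Below is a rough sketch of a variant that avoids the lists by slicing the lazy iterator with itertools.islice. The function run_parallel and the body of the loop are placeholders I invented for illustration, and I have not measured this at full scale. One known cost: islice still has to step through the skipped combinations, so later groups spend time getting to their start index.

```python
import functools
import itertools
import multiprocessing


def process_group(bounds, characters, repeat):
    """Walk only the combinations whose index lies in [start, stop)."""
    start, stop = bounds
    combos = itertools.product(characters, repeat=repeat)
    count = 0
    # islice consumes the iterator lazily, so no list of all
    # combinations is ever built in this process.
    for combo in itertools.islice(combos, start, stop):
        candidate = "".join(combo)  # placeholder for the real per-item work
        count += 1
    return count


def run_parallel(characters, repeat, workers):
    """Hypothetical driver: one (start, stop) slice per worker process."""
    total = len(characters) ** repeat
    size = -(-total // workers)  # ceiling division
    bounds = [(s, min(s + size, total)) for s in range(0, total, size)]
    worker = functools.partial(process_group, characters=characters, repeat=repeat)
    with multiprocessing.Pool(workers) as pool:
        return pool.map(worker, bounds)


if __name__ == "__main__":
    # Prints how many combinations each worker handled.
    print(run_parallel("123", 4, 4))
```

Only the index bounds are sent to the workers, so the data passed between processes stays small regardless of how large the character set is.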
Thanks for your time.