I'm trying to parallelize some code that uses partial functions to generate random numbers for a simulation I'm working on. With the following code:
#!/usr/bin/env python3
import functools
import random
import pathos
from itertools import starmap
from time import sleep
from datetime import datetime

def example(func1, func2):
    sleep(1)
    a, b = func1(), func2()
    return f"arg #1 is {round(a, 2)}, arg #2 is {round(b, 2)} at {datetime.now().time()}"

rand1 = functools.partial(random.uniform, 100, 199)
rand2 = functools.partial(random.uniform, 200, 299)
rand3 = functools.partial(random.uniform, 300, 399)

argsToRun = [(rand1, rand2), (rand2, rand3), (rand1, rand3)]  # 3 ordered combinations...

print("running with a for loop...")
for args in argsToRun:
    result = example(*args)
    print(result)

print("\nRunning with itertools.starmap...")
results = starmap(example, argsToRun)
print("\n".join(results))

print("\nRunning with pathos.mp.starmap...")
with pathos.helpers.mp.Pool() as pool:
    results = pool.starmap(example, argsToRun)
    print("\n".join(results))
I get the following output...
running with a for loop...
arg #1 is 134.5, arg #2 is 232.45 at 11:58:17.025493
arg #1 is 213.38, arg #2 is 306.7 at 11:58:18.027038
arg #1 is 107.3, arg #2 is 347.19 at 11:58:19.028476
Running with itertools.starmap...
arg #1 is 167.7, arg #2 is 247.96 at 11:58:20.030238
arg #1 is 235.97, arg #2 is 318.02 at 11:58:21.031543
arg #1 is 140.41, arg #2 is 387.51 at 11:58:22.032727
Running with pathos.mp.starmap...
arg #1 is 120.24, arg #2 is 208.23 at 11:58:23.100251
arg #1 is 220.24, arg #2 is 308.23 at 11:58:23.112206
arg #1 is 120.24, arg #2 is 308.23 at 11:58:23.126050
The problem is that when I parallelize this, the random functions are NOT being evaluated freshly for each call. Look at the last block: the draws repeat across tasks. The first value in every task sits at x20.24 within its range and the second at x08.23, as if every worker is producing the same underlying random sequence. I put the timestamps in there to convince myself that the last block actually WAS being executed in parallel.
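Here's a minimal check of what I think is happening, assuming the pool forks and each worker starts from a copy of the parent's random state (first_draw is a throwaway helper just for this test):

import os
import random
import pathos

def first_draw(_):
    # report which worker ran this task and its first draw after the fork
    return os.getpid(), random.random()

with pathos.helpers.mp.Pool(3) as pool:
    print(pool.map(first_draw, range(3)))
# if the workers really do inherit the same state, the floats should
# match across different PIDs, just like the repeated offsets above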
I'm sure it has something to do with when and how the tuples of partials are shipped to the worker processes and evaluated there, but at this point I'm lost.
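One workaround I'm considering is reseeding inside the worker so the inherited state gets thrown away. A rough sketch, assuming that reseeding the global random module also affects the partials (since they're bound to the same underlying generator):

import os
import random

def example_reseeded(func1, func2):
    random.seed(os.urandom(16))  # fresh entropy per task, discards the forked state
    a, b = func1(), func2()
    return f"arg #1 is {round(a, 2)}, arg #2 is {round(b, 2)}"

But I'd rather understand why the original version behaves this way than just paper over it.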
The very high-level goal is to build a (very large) list of parameters to pass into a SimPy environment and have the pool execute them in parallel. But until I can figure out how to get the randomness to work, I'm stuck running at 1/32 of the speed I need.
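The other direction I'm weighing is drawing all the random parameters up front in the parent and shipping plain floats to the pool, so worker-side random state never comes into play (run_sim here is a stand-in for my real SimPy entry point, and the list size is made up):

argsToRun = [(rand1(), rand2()) for _ in range(100_000)]  # concrete floats, drawn in the parent

with pathos.helpers.mp.Pool() as pool:
    results = pool.starmap(run_sim, argsToRun)  # run_sim is hypothetical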