I am new to the futures module and have a task that could benefit from parallelization, but I don't seem to be able to figure out exactly how to set up the function for a thread and the function for a process. I'd appreciate any help anyone can shed on the matter.
I'm running a particle swarm optimization (PSO). Without getting into too much detail about PSO itself, here's the basic layout of my code:
There is a `Particle` class with a `getFitness(self)` method (which computes some metric and stores it in `self.fitness`). A PSO simulation has multiple particle instances (easily over 10; hundreds or even thousands for some simulations).
Every so often, I have to compute the fitness of the particles. Currently, I do this in a for-loop:
```python
for p in listOfParticles:
    p.getFitness(args)
```
However, I notice that the fitness of each particle can be computed independently of the others. This makes the fitness computation a prime candidate for parallelization. Indeed, I could do `map(lambda p: p.getFitness(args), listOfParticles)`.
Now, I can easily do this with `futures.ProcessPoolExecutor`:
```python
with futures.ProcessPoolExecutor() as e:
    e.map(lambda p: p.getFitness(args), listOfParticles)
```
Since the side-effects of calling `p.getFitness` are stored in each particle itself, I don't have to worry about getting a return from `futures.ProcessPoolExecutor()`.
So far, so good. But now I notice that `ProcessPoolExecutor` creates new processes, which means it copies memory, which is slow. I'd like to be able to share memory, so I should be using threads. That's well and good, until I realize that running several processes with several threads inside each process will likely be faster, since multiple threads still run on only one processor of my sweet 8-core machine.
Here's where I run into trouble:
Based on the examples I've seen, `ThreadPoolExecutor` operates on a `list`. So does `ProcessPoolExecutor`. So I can't do anything iterative in `ProcessPoolExecutor` to farm out to `ThreadPoolExecutor`, because then `ThreadPoolExecutor` is going to get a single object to work on (see my attempt, posted below).
On the other hand, I can't slice `listOfParticles` myself, because I want `ThreadPoolExecutor` to do its own magic to figure out how many threads are required.
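(By "slice `listOfParticles` myself" I mean something like the following made-up chunking helper, which is exactly the bookkeeping I'd rather the executor handle for me:)

```python
def chunk(seq, n):
    # Split seq into n roughly equal-sized contiguous slices.
    k, m = divmod(len(seq), n)
    return [seq[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n)]

print(chunk(list(range(10)), 3))  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```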
So, the big question (at long last): how should I structure my code so that I can effectively parallelize the following using both processes AND threads:
```python
for p in listOfParticles:
    p.getFitness()
```
This is what I've been trying, but I wouldn't dare try to run it, for I know it won't work:
```python
def threadize(func, L, mw):
    with futures.ThreadPoolExecutor(max_workers=mw) as executor:
        for i in L:
            executor.submit(func, i)

def processize(func, L, mw):
    with futures.ProcessPoolExecutor() as executor:
        executor.map(lambda i: threadize(func, i, mw), L)
I'd appreciate any thoughts on how to fix this, or even on how to improve my approach.
In case it matters, I'm on Python 3.3.2.