Problem Description
I have some code I've just started trying to speed up in Python 3.5. I am trying to accomplish this with the multiprocessing module. Here is a minimal example to demonstrate what I'm trying to do.
Serially, the code is more straightforward. The Momma_Serial class has a list of Baby objects inside of it. Occasionally, we want to call the Baby.evolve() method on each of these. In practice, there are going to be a lot of these Baby objects (only 100 in this example). This was the original motivation for seeking parallelism.
What complicates this whole thing is that the top level of the program decides how each of the many Baby objects evolves by passing in a function, pass_this_func(). This function is an argument to Momma_Serial.evolve_all_elems(), which passes it along to all of the little baby objects inside this momma object.
class Baby:
    def __init__(self, lol):
        self.lol = lol

    def evolve(self, f):
        self.lol = f(self.lol)

def pass_this_func(thing):
    return 2 * thing

class Momma_Serial:
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    def evolve_all_elems(self, the_func):
        for baby in self.my_list:
            baby.evolve(the_func)

momma1 = Momma_Serial(100)
[baby.lol for baby in momma1.my_list]
momma1.evolve_all_elems(pass_this_func)
[baby.lol for baby in momma1.my_list]
This works as it should. But it's slow. Here's my attempt at re-writing the Momma class using the multiprocessing module.
import multiprocessing as mp

class Momma_MP:
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    def evolve_all_elems(self, the_func):
        num_workers = 2

        def f(my_obj):
            my_obj.evolve(the_func)

        with mp.Pool(num_workers) as pool:
            pool.map(f, self.my_list)
Then I try to run it:
momma2 = Momma_MP(100)
[baby.lol for baby in momma2.my_list]
momma2.evolve_all_elems(pass_this_func) #error comes here
# [baby.lol for baby in momma2.my_list]
But I get the error:
AttributeError: Can't pickle local object 'Momma_MP.evolve_all_elems.<locals>.f'
An answer to this Stack Overflow question states "functions are only picklable if they are defined at the top-level of a module." This statement makes it seem like the only way to accomplish this is by defining a function outside of the Momma_MP class. But I really don't want to do that, because it would raise a lot more issues for my code.
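The pickling restriction can be reproduced without multiprocessing at all. The following is a minimal sketch; the names top_level, make_local, and can_pickle are made up for illustration and are not part of the code above. It catches both AttributeError and pickle.PicklingError because the exact exception type may differ between Python versions.

```python
import pickle


def top_level(x):
    # Defined at module level, so pickle can store it by its importable name.
    return 2 * x


def make_local():
    # The inner function only exists inside this call, like f in Momma_MP.
    def local(x):
        return 2 * x
    return local


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if pickling is refused."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


print("top-level function picklable:", can_pickle(top_level))
print("local function picklable:", can_pickle(make_local()))
```

Since pool.map has to pickle the mapped function in order to send it to the worker processes, a function that fails this check cannot be used with it.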
My Questions
(edited a bit)
Is there any workaround? Assume that I cannot define the mapped function outside of the class. Also assume Momma() is not being instantiated in the __main__ top-level script environment. Also, I don't want to deviate too much from this program design, because I want all these Baby() instances to be abstracted away; I don't want the places/programs that instantiate or interact with instances of Momma() to have to worry or know about anything to do with the Baby() class. These extra restrictions make the problem slightly different from the situation here.
By the way, the following doesn't throw an error, but there might be some copying going on, because nothing happens to the constituent Baby objects.
def outside_f(obj):
    obj.evolve(pass_this_func)

class Momma_MP:
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    def evolve_all_elems(self, the_func):
        num_workers = 2
        with mp.Pool(num_workers) as pool:
            pool.map(outside_f, self.my_list)

momma2 = Momma_MP(100)
[baby.lol for baby in momma2.my_list]
momma2.evolve_all_elems(pass_this_func)
[baby.lol for baby in momma2.my_list] # no change here?
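The "no change" result is consistent with each worker process receiving a pickled copy of a Baby, so any mutation stays in the worker. One possible direction, offered only as a sketch, is to have the mapped function return the evolved copy and to rebuild my_list from the results. The class name Momma_MP_Returning and the helper _evolve_one are hypothetical. The sketch assumes that a static method reached through its class can be pickled by qualified name, which holds on recent Python 3 releases but has not been checked against 3.5, and that the_func itself is picklable (for example, a module-level function).

```python
import multiprocessing as mp
from functools import partial


class Baby:
    def __init__(self, lol):
        self.lol = lol

    def evolve(self, f):
        self.lol = f(self.lol)


def pass_this_func(thing):
    return 2 * thing


class Momma_MP_Returning:  # hypothetical name, not from the original post
    def __init__(self, num):
        self.my_list = [Baby(i) for i in range(num)]

    @staticmethod
    def _evolve_one(the_func, baby):
        # In a worker this runs on a pickled copy of baby,
        # so the evolved copy is handed back to the parent.
        baby.evolve(the_func)
        return baby

    def evolve_all_elems(self, the_func, num_workers=2):
        # Bind the_func so that pool.map only has to supply each Baby.
        worker = partial(Momma_MP_Returning._evolve_one, the_func)
        with mp.Pool(num_workers) as pool:
            # map preserves input order, so results line up with my_list.
            self.my_list = pool.map(worker, self.my_list)


if __name__ == "__main__":
    momma3 = Momma_MP_Returning(100)
    momma3.evolve_all_elems(pass_this_func)
    print([baby.lol for baby in momma3.my_list][:5])
```

Callers still only touch the Momma-style object, but the Baby instances in my_list are replaced by new objects on every call, so any outside reference to an old Baby would go stale. The cost of pickling every Baby in both directions may also outweigh the gain when the work per object is small.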