To start with, here is some code that works:

    from multiprocessing import Pool, Manager
    import random

    manager = Manager()
    dct = manager.dict()

    def do_thing(n):
        for i in range(10_000_000):
            i += 1
        dct[n] = random.randint(0, 9)

    with Pool(2) as pool:
        pool.map(do_thing, range(10))

Now if I try to make a class out of this:

    from multiprocessing import Pool, Manager
    import random

    class SomeClass:
        def __init__(self):
            self.manager = Manager()
            self.dct = self.manager.dict()

        def __call__(self):
            with Pool(2) as pool:
                pool.map(self.do_thing, range(10))

        def do_thing(self, n):
            for i in range(10_000_000):
                i += 1
            self.dct[n] = random.randint(0, 9)

    if __name__ == '__main__':
        inst = SomeClass()
        inst()

I run into: TypeError: Pickling an AuthenticationString object is disallowed for security reasons. From this, I get the hint that Python is trying to pickle the Manager, which as I understand has its own dedicated process, and processes can't be pickled because they contain an AuthenticationString.
I don't know enough about how forking works (I'm on Linux, so I understand this is the default method for starting new processes) to understand exactly why the Manager instance needs to be pickled.
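
A minimal sketch of what seems to be going on, as far as I can tell: pool.map has to pickle self.do_thing to send it to a worker, and pickling a bound method also pickles the instance it is bound to, with all of its attributes. The example below avoids starting any processes. Holder and PicklableHolder are hypothetical stand-ins for SomeClass, and the secret attribute stands in for self.manager, which I am assuming carries an AuthenticationString internally.

```python
import pickle
from multiprocessing.process import AuthenticationString


class Holder:
    """Hypothetical stand-in for SomeClass with one unpicklable attribute."""

    def __init__(self):
        # Stand-in for self.manager (assumed to hold an AuthenticationString).
        self.secret = AuthenticationString(b"not-a-real-key")
        self.value = 42

    def method(self, n):
        return n + self.value


class PicklableHolder(Holder):
    """Same class, but the unpicklable attribute is left out of the state."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["secret"]  # the copy that gets pickled no longer has it
        return state


def try_pickle(obj):
    """Return "ok" if obj can be pickled, else the TypeError message."""
    try:
        pickle.dumps(obj)
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    # Pickling the bound method drags the whole instance along.
    print(try_pickle(Holder().method))
    print(try_pickle(PicklableHolder().method))
```

If this reading is right, the same __getstate__ idea applied to SomeClass would mean dropping the manager attribute from the pickled state while keeping dct, though I have not confirmed that this is the conventional approach.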
So here are my questions:
- Why is this happening?
- How can I use a Manager when doing multiprocessing within a class? PS: I want to be able to import SomeClass from this module.
- Is what I'm asking for unreasonable or unconventional?
PS: I know I can do this exact snippet without the Manager by exploiting the fact that pool.map returns results in order, so something like this: res = pool.map(self.do_thing, range(10)), then dct = {k: v for k, v in zip(range(10), res)}. But that's beside the point of the question.
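
For completeness, the Manager-free variant described in the PS can be sketched as follows. The n_items parameter and the collect helper are hypothetical names added for illustration, and the loop count is reduced so the sketch runs quickly.

```python
from multiprocessing import Pool
import random


class SomeClass:
    def __init__(self, n_items=10):
        # n_items is a hypothetical parameter; the question hard-codes 10.
        self.n_items = n_items
        self.dct = {}  # plain dict, filled in by the parent process

    def __call__(self):
        keys = range(self.n_items)
        with Pool(2) as pool:
            # pool.map returns results in the same order as its inputs.
            res = pool.map(self.do_thing, keys)
        self.dct = self.collect(keys, res)
        return self.dct

    @staticmethod
    def collect(keys, res):
        # Pair each input with its result, relying on the preserved order.
        return dict(zip(keys, res))

    def do_thing(self, n):
        for i in range(100_000):  # smaller stand-in for the CPU-bound loop
            i += 1
        return random.randint(0, 9)  # returned instead of written to a shared dict


if __name__ == "__main__":
    inst = SomeClass()
    print(inst())
```

Here the workers only return values and the instance holds nothing unpicklable, so pickling self.do_thing should not be a problem.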