I have some code that performs operations using the pathos extension of the multiprocessing library. My question is how to employ a more complex worker function, in this case named New_PP. How should I format the thpool line so that my worker function can access a dictionary it requires in order to produce a result? A dictionary defined at module level is a global in Python, but within the scope of the worker function I get a NameError saying this dictionary (access_dict) is not found. How can I send the dictionary in, or otherwise ensure it is available to my workers?
import pathos.multiprocessing

Nchunks = 10
thpool = pathos.multiprocessing.ThreadingPool()
mppool = pathos.multiprocessing.ProcessingPool()
Lchunk = int(len(readinfiles) / Nchunks)
filechunks = chunks(readinfiles, Lchunk)  # chunk size Lchunk, giving roughly Nchunks chunks
for fnames in filechunks:
    files = (open(name, 'r') for name in fnames)
    # one thread per file; each thread maps New_PP over its file's lines in the process pool
    res = thpool.map(mppool.map, [New_PP] * len(fnames), files)
    print(res[0][0])
And the worker function:
def New_PP(line):
    split_line = line.rstrip()
    if len(split_line) > 1:
        access_dict[4] ....
How can the worker function get at access_dict?
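Since pathos serialises with dill, one workaround is to pass the dictionary to the worker explicitly instead of relying on a global, for example by binding it with functools.partial so the pool only has to supply the lines. This is a minimal sketch, not the original code: the contents of access_dict are invented, and the built-in map stands in for mppool.map:

```python
from functools import partial

# hypothetical stand-in for the real access_dict
access_dict = {4: "gene", 7: "exon"}

def new_pp(access_dict, line):
    """Worker that receives the dict as an explicit argument."""
    split_line = line.rstrip()
    if len(split_line) > 1:
        return access_dict.get(4)
    return None

# bind the dict into the worker; dill (used by pathos) can ship
# this partial object to the worker processes
worker = partial(new_pp, access_dict)

# built-in map standing in for mppool.map(worker, lines)
results = list(map(worker, ["some line\n", "x\n"]))
```

With the real pools the call shape stays the same, e.g. thpool.map(mppool.map, [worker] * len(fnames), files).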
I have also tried to wrap up my function inside a class as follows:
class MAPPP:
    def __init__(self, value_dict):
        self.access_dict = value_dict

    def New_PP(self, line):
        self.mytype = self.access_dict
        return self.mytype
and:
mapp = MAPPP(value_dict)
print(mapp.access_dict)
res = thpool.map(mppool.map, [mapp.New_PP] * len(fnames), files)
However, I get the same issue.
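For the class route, the key is that the dict lives on the instance and is read through self, so the bound method carries its data with it when dill serialises it for the workers. A sketch under those assumptions, with an invented dict and plain map standing in for mppool.map:

```python
class MAPPP:
    """Holds the lookup dict on the instance; pathos (via dill) can
    serialise the instance and its bound methods to worker processes."""

    def __init__(self, value_dict):
        self.access_dict = value_dict  # keep the dict on the instance

    def New_PP(self, line):
        stripped = line.rstrip()
        if len(stripped) > 1:
            # hypothetical lookup, mirroring access_dict[4] in the question
            return self.access_dict.get(4)
        return None

mapp = MAPPP({4: "found"})
# with the real pools: mppool.map(mapp.New_PP, lines)
out = list(map(mapp.New_PP, ["a long line\n", "x\n"]))
```

Because the method resolves the dict through self rather than through a module-level name, no global needs to exist in the worker process at all.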