I am working on a project that requires me to extract a lot of information from a set of files. The file format and most of the project details do not matter for this question. What I do not understand is how to share a dictionary with all the processes in a process pool.
Here is my code (I renamed the variables and cut it down to the need-to-know parts):
import json
import multiprocessing
from multiprocessing import Pool, Lock, Manager
import glob
import os

def record(thing, map):
    with mutex:
        if(thing in map):
            map[thing] += 1
        else:
            map[thing] = 1

def getThing(file, n, map):
    #do stuff
    thing = file.read()
    record(thing, map)

def init(l):
    global mutex
    mutex = l

def main():
    #create a manager to manage shared dictionaries
    manager = Manager()
    #get the list of filenames to be analyzed
    fileSet1=glob.glob("filesSet1/*")
    fileSet2=glob.glob("fileSet2/*")
    #create a global mutex for the processes to share
    l = Lock()
    map = manager.dict()
    #create a process pool, give it the global mutex, and max cpu count-1 (manager is its own process)
    with Pool(processes=multiprocessing.cpu_count()-1, initializer=init, initargs=(l,)) as pool:
        pool.map(lambda file: getThing(file, 2, map), fileSet1) #This line is what i need help with

main()
From what I understand, that lambda function should work. The line I need help with is: pool.map(lambda file: getThing(file, 2, map), fileSet1). It raises an error there: "AttributeError: Can't pickle local object 'main.<locals>.<lambda>'".
Any help would be appreciated!
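
A minimal sketch of one possible way around the error, assuming the cause is that pool.map has to pickle the callable it sends to the workers, and pickle cannot serialize a lambda defined inside main(). The sketch swaps the lambda for functools.partial wrapped around a module-level function. The names get_thing, count_files, and counts are hypothetical, and reading the whole file is only a stand-in for the elided "do stuff" logic.

```python
import functools
import glob
import multiprocessing


def init(lock):
    """Pool initializer: keep the shared lock as a global in each worker."""
    global mutex
    mutex = lock


def record(thing, counts):
    """Increment the tally for `thing` while holding the shared lock."""
    with mutex:
        counts[thing] = counts.get(thing, 0) + 1


def get_thing(path, n, counts):
    """Module-level worker, so pickle can locate it by name."""
    # Assumption: reading the whole file stands in for the real extraction;
    # n is unused here, as in the trimmed code above.
    with open(path) as f:
        thing = f.read()
    record(thing, counts)


def count_files(paths):
    """Hypothetical driver: tally file contents across a process pool."""
    with multiprocessing.Manager() as manager:
        counts = manager.dict()
        lock = multiprocessing.Lock()
        # A partial of a module-level function can be pickled; a lambda cannot.
        worker = functools.partial(get_thing, n=2, counts=counts)
        # Guard against cpu_count() - 1 being 0 on a single-core machine.
        processes = max(1, multiprocessing.cpu_count() - 1)
        with multiprocessing.Pool(processes, initializer=init, initargs=(lock,)) as pool:
            pool.map(worker, paths)
        return dict(counts)  # copy out before the manager shuts down


if __name__ == "__main__":
    print(count_files(glob.glob("filesSet1/*")))
```

The manager.dict() proxy can be pickled, so it can travel inside the partial's keyword arguments, while the lock still reaches the workers through the pool initializer as in the original code. The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module.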