I am trying to do multiprocessing on some files. The files belong to two different categories, namely A and B. The idea is to iterate over all the files, where the filenames are contained in two lists, category_A and category_B. Here is a simple class that I have written for doing multiprocessing:
import gc
from multiprocessing import Pool

from scipy.io import loadmat  # assuming loadmat is scipy.io.loadmat


class ImageProc:
    def __init__(self):
        self.data = []

    def __call__(self, sample, category="A"):
        subject = str(sample)
        sample = loadmat(sample)
        age = int(sample["Age"][0][0])
        nb_images = sample['images'].shape[2]
        del sample
        # One row per image in the file
        for i in range(nb_images):
            self.data.append((subject, category, age, i))
        gc.collect()


# This works fine for category A
proc = ImageProc()
pool = Pool()
_ = pool.map(proc, category_A)
But now I want to use the same instance and call the same function for category_B, for which I have to explicitly pass the argument category="B" to the __call__ method. How can I achieve that?
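One possible way to pass the extra argument, sketched under assumptions: `functools.partial` can bind `category="B"` to the existing callable, so `pool.map` still supplies only the filename. In the sketch below, `Tagger` is a hypothetical stand-in for `ImageProc` that skips the `.mat` loading, and the file names are made up.

```python
from functools import partial
from multiprocessing import Pool


class Tagger:
    """Hypothetical stand-in for ImageProc; it does not load any file."""

    def __call__(self, sample, category="A"):
        # Return the record instead of storing it, so the caller can see the result.
        return (str(sample), category)


if __name__ == "__main__":
    proc = Tagger()
    category_B = ["b1.mat", "b2.mat"]  # made-up file names

    # partial() fixes category="B"; pool.map then passes one filename per call.
    proc_B = partial(proc, category="B")
    with Pool(2) as pool:
        print(pool.map(proc_B, category_B))
```

The same `proc` instance is reused here; only the wrapper returned by `partial` differs between the two categories.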
EDIT: Given the useful comments, I also want to elaborate that this list, represented here by self.data, is common to both category_A and category_B. If I make it global, then it won't be possible to use the pool instances on the same list to write data.
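On the shared list: with `multiprocessing.Pool`, each worker process operates on its own pickled copy of the callable, so rows appended to `self.data` inside a worker do not show up in the parent's instance. A minimal sketch of one alternative is to return the rows from each call and merge them in the parent; `pool.starmap` also lets a single call cover both categories. Here `fake_load`, `rows_for`, and `collect` are hypothetical names, and the age and image count are invented values standing in for what `loadmat` would provide.

```python
from multiprocessing import Pool


def fake_load(path):
    """Hypothetical stand-in for loadmat; returns an invented (age, nb_images)."""
    return 30, 2


def rows_for(path, category):
    """Build the rows for one file and return them instead of appending to shared state."""
    age, nb_images = fake_load(path)
    return [(str(path), category, age, i) for i in range(nb_images)]


def collect(category_A, category_B, processes=2):
    """Process both lists in one pool and merge the rows into a single list in the parent."""
    tasks = [(p, "A") for p in category_A] + [(p, "B") for p in category_B]
    data = []
    with Pool(processes) as pool:
        # starmap unpacks each (path, category) tuple into rows_for(path, category).
        for rows in pool.starmap(rows_for, tasks):
            data.extend(rows)
    return data


if __name__ == "__main__":
    data = collect(["a1.mat"], ["b1.mat"])  # made-up file names
    print(len(data))
```

Because `starmap` returns results in task order, the merged list holds the rows for category A before those for category B.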