I am trying to write a Python class that reads a number of images in parallel using multiprocessing.Pool and threading.Lock. My approach is to create a pool of threads in which each thread reads an image and appends it to a member variable of list type. The class also provides a method to obtain the list once all the images have been read.
import threading
from multiprocessing import Pool

class ReadFilePool(object):
    # filenames contains a list of absolute image paths
    def __init__(self, filenames):
        self.filenames = filenames
        self.images = []
        self.lock = threading.Lock()
        self.pool = Pool(processes=len(self.filenames))
        self.pool.map(self.read_file, [filename for filename in self.filenames])

    def read_file(self, filename):
        image = get_image(filename)
        self.lock.acquire()
        self.images.append(image)
        self.lock.release()

    def get_images(self):
        # Returns None until every file has been read
        images = None
        self.lock.acquire()
        if len(self.filenames) == len(self.images):
            images = self.images
        self.lock.release()
        return images
Then I loop until get_images returns something other than None and process the images, e.g.
images = []
completed = False
pool = ReadFilePool(filenames)
while not completed:
    images = pool.get_images()
    completed = images is not None
# ...process the images
I tried the following approaches, but I still got pickle errors such as TypeError: can't pickle _thread.lock objects
Approach 1: __setstate__ and __getstate__
Approach 2: __call__
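For context, a minimal sketch of where this error appears to come from. Pool.map pickles the callable it sends to the worker processes, and pickling a bound method such as self.read_file also pickles the instance it belongs to, including self.lock. The dictionary below is only a hypothetical stand-in for the instance's attributes, and try_pickle is a helper invented for this illustration.

```python
import pickle
import threading


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the error message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


# Hypothetical stand-in for the attributes stored on a ReadFilePool instance.
state_with_lock = {"filenames": ["a.png"], "images": [], "lock": threading.Lock()}
state_without_lock = {"filenames": ["a.png"], "images": []}

error_with_lock = try_pickle(state_with_lock)
error_without_lock = try_pickle(state_without_lock)

print("with lock:   ", error_with_lock)     # a TypeError message about the lock
print("without lock:", error_without_lock)  # None, pickling succeeded
```

The exact wording of the message differs between Python versions, but in both cases the lock is what cannot be serialized.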
Unfortunately I am not very familiar with Python multithreading and Lock, and I have run into a few pickle-related errors. Please suggest the correct way to use these classes.
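For comparison, here is a rough sketch of a variant that needs neither a lock nor a shared list, because each worker returns its result and map collects them in input order. It swaps in multiprocessing.pool.ThreadPool (threads, so nothing is pickled) on the assumption that reading images is mostly I/O-bound. The get_image body, the loader parameter and the workers default are placeholders made up for this illustration, not the real loader.

```python
from multiprocessing.pool import ThreadPool


def get_image(filename):
    # Hypothetical placeholder for the real image loader: returns raw bytes.
    with open(filename, "rb") as f:
        return f.read()


def read_images(filenames, loader=get_image, workers=4):
    """Read all files concurrently and return the results in input order."""
    if not filenames:
        return []
    with ThreadPool(processes=min(workers, len(filenames))) as pool:
        # map blocks until every item is done, so no polling loop is needed.
        return pool.map(loader, filenames)
```

Usage would then be a single call such as images = read_images(filenames), after which the images can be processed directly.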