Your description suggests that a simple thread (or process) pool would work:

    #!/usr/bin/env python
    from multiprocessing.dummy import Pool  # thread pool
    from tqdm import tqdm  # $ pip install tqdm # simple progress report

    def mp_process_image(filename):
        # process_image() is your own function; catch errors per file so
        # that one bad image doesn't abort the whole run
        try:
            return filename, process_image(filename), None
        except Exception as e:
            return filename, None, str(e)

    def main():
        with open('image_paths.txt') as file:
            # consider every non-blank line in the input file to be an image path
            image_paths = (line.strip() for line in file if line.strip())
            pool = Pool()  # number of threads equal to number of CPUs
            it = pool.imap_unordered(mp_process_image, image_paths, chunksize=100)
            for filename, result, error in tqdm(it):
                if error is not None:
                    print(filename, error)

    if __name__ == "__main__":
        main()

I assume that process_image() is CPU-bound and that it releases the GIL, i.e., it does its main work in a C extension such as OpenCV. If process_image() doesn't release the GIL, then remove the word .dummy from the Pool import to use processes instead of threads.
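
The thread-versus-process switch described above can be sketched as a single flag, since both pool flavours expose the same interface. This is only an illustration: the process_image() body here is a hypothetical pure-Python stand-in for your real function, and process_all() and use_processes are names invented for the example.

```python
from multiprocessing import Pool as ProcessPool       # real processes
from multiprocessing.dummy import Pool as ThreadPool  # same API, backed by threads

def process_image(filename):
    # Hypothetical stand-in for the real function (an assumption for this
    # sketch): pure Python, so it would hold the GIL while it runs.
    if not filename.endswith('.png'):
        raise ValueError('unsupported format')
    return sum(ord(c) for c in filename)

def mp_process_image(filename):
    # Report errors per file instead of raising inside the pool
    try:
        return filename, process_image(filename), None
    except Exception as e:
        return filename, None, str(e)

def process_all(image_paths, use_processes=False):
    # Choose the pool flavour; everything else stays the same
    pool_class = ProcessPool if use_processes else ThreadPool
    with pool_class() as pool:
        results = list(pool.imap_unordered(mp_process_image, image_paths))
    # imap_unordered yields in completion order, so sort by filename
    return sorted(results, key=lambda r: r[0])
```

With use_processes=True the worker function and its arguments must be picklable, and the call should stay under the if __name__ == "__main__" guard, as in the script above, so that platforms which spawn fresh interpreters don't re-run it on import.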