I'm a new Python programmer and have code that manipulates a large number of files, with operations like compressing, uncompressing, and copying. To improve performance I use multiprocessing, something like:
from multiprocessing import Pool

if __name__ == "__main__":  # needed on macOS/Windows, where workers start via spawn
    with Pool(4) as pool:
        pool.map(do_task, tasks)
There are some savings: execution time dropped from 75 to 55 seconds. But changing the number of processes doesn't seem to have any impact.
I also tried multithreading, and the result is about the same. The savings seem to be capped at a certain level no matter what I do.
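In case it matters, here is roughly how I compared processes against threads; do_task and tasks are stand-ins for my real code, and ThreadPool is just what I found as a drop-in threaded version of Pool:

import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool

def time_pool(pool_cls, workers):
    # Run the same workload and report the wall-clock time it took.
    start = time.perf_counter()
    with pool_cls(workers) as pool:
        pool.map(do_task, tasks)
    return time.perf_counter() - start

if __name__ == "__main__":
    for n in (2, 4, 8):
        print(f"{n} processes: {time_pool(Pool, n):.1f}s")
        print(f"{n} threads:   {time_pool(ThreadPool, n):.1f}s")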
I have a hard time figuring out why I can't get a bigger saving. I've read about terms like CPU-bound and I/O-bound, but I don't know how to tell in practice which one I'm running into. Is that something I can check from Activity Monitor, or what's the suggested approach?
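From what I've read, one way might be to run the workload serially in a single process and compare CPU time to wall-clock time; if that's a valid check, I'd expect something like this to tell the two cases apart (again with do_task and tasks standing in for my real code):

import time

start_wall = time.perf_counter()
start_cpu = time.process_time()
for t in tasks:  # run the whole workload serially in one process
    do_task(t)
wall = time.perf_counter() - start_wall
cpu = time.process_time() - start_cpu

# If cpu is close to wall, the process was mostly computing (CPU-bound);
# if cpu is much smaller, it was mostly waiting on disk (I/O-bound).
print(f"wall: {wall:.1f}s  cpu: {cpu:.1f}s")

Would that comparison be reliable, or is watching CPU usage in Activity Monitor while the script runs just as good?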