I am using a multiprocessing Pool to run my program across multiple processes. I load two heavyweight objects (each around 3.5 GB on disk) into the parent process, and the pooled worker processes then use them to generate the output (this is a Linux system, so the copy-on-write mechanism applies). Every pooled process writes to a single file that is shared across all the processes.

My question: on a 36-core system, I gain performance up to a certain number of pooled processes (roughly 10). Beyond that number, the parts of the program that use those heavy objects take longer and longer, and the performance gain from multiprocessing is lost. Is there any specific science behind this phenomenon, or is a performance gain always guaranteed with more pooled processes? Thanks.
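A minimal sketch of the kind of setup described above (the object names, the toy data, and the output path are placeholders, not the actual code; it assumes the default fork start method on Linux):

```python
import multiprocessing as mp

# Stand-ins for the two heavyweight objects (hypothetical names, toy sizes).
# Loading them at module level, before the Pool is created, lets the forked
# workers inherit them, so Linux copy-on-write avoids duplicating the memory
# as long as the workers only read them.
BIG_OBJECT_A = {i: i * i for i in range(1000)}    # imagine ~3.5 GB here
BIG_OBJECT_B = list(range(1000))                  # imagine ~3.5 GB here

write_lock = mp.Lock()   # serialises appends to the single shared output file

def worker(item):
    # Read-only use of the inherited objects keeps their pages shared (CoW).
    result = BIG_OBJECT_A[item % 1000] + BIG_OBJECT_B[item % 1000]
    with write_lock, open("output.txt", "a") as fh:   # one file, all workers
        fh.write(f"{item}\t{result}\n")
    return result

if __name__ == "__main__":
    # Around 10 workers was the observed sweet spot on the 36-core machine.
    with mp.Pool(processes=10) as pool:
        pool.map(worker, range(100))
```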
- Probably because you are running out of native thread count as the process count grows. You should perhaps bind each process to a specific core (leave some for the system and other processes); a sketch of this follows the comments. – bhathiya-perera Jan 07 '19 at 09:43
- Possible duplicate of https://stackoverflow.com/questions/25058006/python-pool-map-and-choosing-number-of-processes – bhathiya-perera Jan 07 '19 at 09:45
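Regarding the first comment's suggestion of pinning each worker to a core: a minimal sketch of one way to do this on Linux with `os.sched_setaffinity`, where the choice of which cores to use and the counter-based assignment are assumptions, not a prescription:

```python
import os
import multiprocessing as mp

def pin_to_core(counter, cores):
    """Pool initializer: pin the current worker to one core from `cores`."""
    with counter.get_lock():
        idx = counter.value
        counter.value += 1
    # Linux-only call: restrict this worker process to a single core.
    os.sched_setaffinity(0, {cores[idx % len(cores)]})

def which_cpu(_):
    # Report which core(s) this worker is allowed to run on.
    return os.getpid(), os.sched_getaffinity(0)

if __name__ == "__main__":
    cores = list(range(2, 12))      # leave cores 0-1 for the OS (an assumption)
    counter = mp.Value("i", 0)      # shared counter used to hand out cores
    with mp.Pool(processes=len(cores),
                 initializer=pin_to_core,
                 initargs=(counter, cores)) as pool:
        for pid, affinity in pool.map(which_cpu, range(len(cores))):
            print(pid, sorted(affinity))
```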