On a cluster with 96 cores, I need to run a Python function, `readInfo(filename)`, which reads information from files. I have 90 files to read (each one takes about 15 minutes), and I used `multiprocessing.Pool(processes=90)` to parallelize the reads. When I check with `htop` from the terminal, I see only a few cores running instead of 90. Why?
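For reference, the setup described above presumably looks something like the sketch below. This is only an illustration under assumptions: the body of `readInfo`, the file names, and the `read_all` helper are placeholders, not the actual code. Returning the worker PID from each call is one way to count how many processes really took part.

```python
import multiprocessing
import os


def readInfo(filename):
    # Placeholder body (assumption): the real function spends about
    # 15 minutes reading each file. Returning the worker's PID makes it
    # possible to count how many processes actually handled a file.
    return filename, os.getpid()


def read_all(filenames, processes=90):
    # Hypothetical helper: map readInfo over the files with a process pool.
    # chunksize=1 hands out one file per task, so no worker receives a
    # batch of several files while other workers sit idle.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(readInfo, filenames, chunksize=1)


if __name__ == "__main__":
    # Hypothetical file names standing in for the 90 real files.
    filenames = [f"file_{i:02d}.dat" for i in range(90)]
    results = read_all(filenames)
    worker_pids = {pid for _, pid in results}
    print(f"{len(results)} files handled by {len(worker_pids)} worker processes")
```

With a trivial placeholder like this, a few fast workers may grab most of the tasks, so the PID count is only informative once `readInfo` does its real, long-running work. Note also that `htop` shows CPU load: processes that are waiting on the disk can all be alive while only a few cores look busy.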
Byba
- I guess all 90 are on one hard drive? It cannot read 90 files in parallel. – zvone Dec 11 '22 at 12:35
- Maybe, since those files are stored on the same disk, you can't make use of the processing power you have. – Mohammed Ibrahim Dec 11 '22 at 12:36
- Did you try `multi-processing` or `multi-threading`? See https://stackoverflow.com/questions/18114285/what-are-the-differences-between-the-threading-and-multiprocessing-modules – Cpt.Hook Dec 11 '22 at 13:02
- Seeing at least an outline of some code might help us debug. – Frank Yellin Dec 11 '22 at 18:48
- We don't need to see your whole code, but seeing how you set up multiprocessing and start the jobs would help. – Frank Yellin Dec 12 '22 at 18:43