I have a fairly large script that uses the libraries numpy, scipy, sklearn, and matplotlib. I need to limit its CPU usage to stop it from consuming all the available processing power on my computational cluster. Following this answer, I implemented the following block of code, which is executed as soon as the script is run:
import os
parallel_procs = "4"
os.environ["OMP_NUM_THREADS"] = parallel_procs
os.environ["MKL_NUM_THREADS"] = parallel_procs
os.environ["OPENBLAS_NUM_THREADS"] = parallel_procs
os.environ["VECLIB_MAXIMUM_THREADS"] = parallel_procs
os.environ["NUMEXPR_NUM_THREADS"] = parallel_procs
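One detail worth noting: these variables are only honoured if they are set before numpy/scipy are first imported, because the BLAS/OpenMP runtimes read them once at load time. A minimal sketch of the safe ordering (the loop over variable names is just a compact rewrite of the block above):

```python
import os

# Set the thread caps BEFORE importing any numerical library;
# the BLAS/OpenMP runtimes read these variables at load time.
parallel_procs = "4"
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS",
            "VECLIB_MAXIMUM_THREADS", "NUMEXPR_NUM_THREADS"):
    os.environ[var] = parallel_procs

import numpy as np  # imported only after the caps are in place
```

If numpy (or anything that imports it) is loaded earlier, the caps are silently ignored for that process.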
My understanding is that this should limit the number of cores used to 4, but apparently this is not happening. This is what htop shows for my user and that script: there are 16 processes, 4 of which show CPU percentages above 100%. This is an excerpt of the lscpu output:
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
I am also using the multiprocessing library further down in my code, where I set the same number of processes with multiprocessing.Pool(processes=4). Without the block of code shown above, the script insisted on using as many cores as possible, apparently ignoring the multiprocessing limit entirely.
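For context, the Pool usage looks roughly like this (the worker function here is a hypothetical stand-in for the real per-task computation):

```python
import multiprocessing as mp

def work(x):
    # hypothetical worker: stand-in for the real numpy/scipy computation
    return x * x

if __name__ == "__main__":
    # cap the number of worker processes, mirroring the thread caps above
    with mp.Pool(processes=4) as pool:
        results = pool.map(work, range(8))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Note that the two limits multiply: 4 pool workers each allowed 4 BLAS threads can still occupy up to 16 logical CPUs in total.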
My questions are then: what exactly am I limiting when I use the code above, and how should I interpret the htop output?