
I need to limit the CPU usage of the following code, since it is using 100% of the CPU.

    from sklearn.linear_model import LinearRegression
    model = LinearRegression(fit_intercept=True, n_jobs=1)
    model.fit(df_x0, df_y0)
    model.predict(df_x1)

I have set n_jobs=1 and I did not use multiprocessing, but it still keeps all cores fully occupied. Also, df_y0.ndim == 1, and I learned that n_jobs is not effective in that case.

Can anyone tell me why it's using 100% of the CPU, and how to solve it in Python?

Python 3.7, Linux.


1 Answer


With n_jobs=1 it uses 100% of one core. Each process runs on a different core, and each process takes 100% of its core. On Linux with 4 cores the CPU usage can be seen clearly (a sketch follows the list):

  • (100%, ~5%, ~5%, ~5%) when running with n_jobs=1 (only one core is used).

  • (100%, 100%, 100%, 100%) when running with n_jobs=-1 (all cores are used).
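
For illustration, here is a minimal sketch with synthetic data (the array sizes are arbitrary placeholders and may need to be scaled up for the usage to be visible in top/htop). cross_val_score is used because n_jobs-based process parallelism is easy to observe with it; for a single-target, dense LinearRegression fit, n_jobs itself has little effect:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    # Synthetic placeholder data
    X = np.random.rand(200_000, 20)
    y = np.random.rand(200_000)

    # Watch top/htop while these run (assuming a single-threaded BLAS):
    # n_jobs=1  -> one worker, roughly (100%, ~5%, ~5%, ~5%) on 4 cores
    # n_jobs=-1 -> one worker per core, roughly (100%, 100%, 100%, 100%)
    cross_val_score(LinearRegression(), X, y, cv=4, n_jobs=1)
    cross_val_score(LinearRegression(), X, y, cv=4, n_jobs=-1)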

Also, you can check @kenlukas' answer, based on his test with scikit-learn 0.20.3 on Linux.

Update: To cover the scenarios raised in the question Unintended multithreading in Python (scikit-learn), please check out its answers.
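
For the global, environment-variable route, the variables have to be set before NumPy/scikit-learn are imported; which one takes effect depends on the BLAS your NumPy build uses (a minimal sketch):

    import os
    # Must be set before importing numpy / scikit-learn, otherwise it has no effect.
    os.environ["OMP_NUM_THREADS"] = "1"        # OpenMP
    os.environ["OPENBLAS_NUM_THREADS"] = "1"   # OpenBLAS
    os.environ["MKL_NUM_THREADS"] = "1"        # Intel MKL

    import numpy as np
    from sklearn.linear_model import LinearRegression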

In case you want to set the number of threads dynamically, rather than globally via an environment variable, you can do, for example:

    # Only takes effect if NumPy/SciPy are built against Intel MKL.
    import mkl
    mkl.set_num_threads(2)
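
As Damon notes in the comments below, OpenBLAS (not MKL) was the culprit in this case, so mkl.set_num_threads alone may not help; a BLAS-agnostic way to limit threads at runtime is the threadpoolctl package (a sketch, assuming it is installed):

    from threadpoolctl import threadpool_limits

    # Limit all BLAS thread pools (OpenBLAS, MKL, ...) to one thread
    # for the duration of the block.
    with threadpool_limits(limits=1, user_api="blas"):
        model.fit(df_x0, df_y0)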
  • But the truth is, I set n_jobs=1 and it uses all the cores at 100% anyway. I found the solution: it was fixed by setting OPENBLAS_NUM_THREADS=1. Thank you. – Damon Jan 02 '20 at 03:04
  • @Damon There are probably other options besides `openBLAS` or `MKL`, but check this [answer](https://stackoverflow.com/a/48665619/10452700) and the [idea behind it](https://stackoverflow.com/questions/56104472/why-would-setting-export-openblas-num-threads-1-impair-the-performance). Check out the short answer in **Update** and see if it works for you. – Mario Jan 02 '20 at 12:36