How to set `n_jobs` in sklearn ElasticNet

Question

I am trying to run the ElasticNet estimator from scikit-learn on a machine with multiple CPUs. However, I need the ElasticNet fit to use only one CPU, because I need to run other fitting routines in parallel on the remaining CPUs. Whenever the thread containing ElasticNet starts the fit, it quickly takes over all free capacity on every CPU instead of just the one it's called on. Because other routines are already running on these machines, ElasticNet oversubscribes them and slows everything down tremendously, including itself. I need these routines to run in parallel, so I cannot just run the ElasticNet fit serially ahead of time.

Unlike other regression estimators in sklearn (linear, logistic, ...), ElasticNet has no n_jobs argument. From my reading of the documentation, it appears that ElasticNet defaults to the n_jobs specified in joblib.parallel_backend, which itself defaults to n_jobs=-1, meaning all available CPUs.

I am trying to figure out the proper way to specify n_jobs in parallel_backend so that it overrides the default for ElasticNet. Below are three attempts to change n_jobs, none of which has worked so far.

Attempt 1

from joblib import parallel_backend
from sklearn.linear_model import ElasticNet

with parallel_backend('loky', n_jobs=1):
                    
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False, 
                       normalize=False, copy_X=True, max_iter=10000, tol=10,
                       random_state=42, precompute=False, warm_start=False,
                       positive=False, selection='cyclic')
    model.fit(predictors, response)

Attempt 2

from sklearn.utils import parallel_backend
from sklearn.linear_model import ElasticNet

with parallel_backend('loky', n_jobs=1):
                    
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False, 
                       normalize=False, copy_X=True, max_iter=10000, tol=10,
                       random_state=42, precompute=False, warm_start=False,
                       positive=False, selection='cyclic')
    model.fit(predictors, response)

Neither Attempt 1 nor Attempt 2 throws an error, but neither appears to change n_jobs from the default of using every available CPU. ElasticNet still saturates all CPUs and quickly oversubscribes the machines.

Attempt 3

This is my first time using joblib directly, so I've been reading the documentation on parallelization with joblib. Most of the example routines placed inside the parallel_backend context are wrapped in a call to the Parallel() helper class. Following the examples, I modified Attempt 1 in the following way:

from joblib import Parallel, parallel_backend
from sklearn.linear_model import ElasticNet

with parallel_backend('loky', n_jobs=1):
                    
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False, 
                       normalize=False, copy_X=True, max_iter=10000, tol=10,
                       random_state=42, precompute=False, warm_start=False,
                       positive=False, selection='cyclic')
    Parallel(n_jobs=1)(model.fit(predictors, response))

However, when running Attempt 3, I get the following error message:

TypeError: 'ElasticNet' object is not iterable
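
A note on this error: `model.fit(predictors, response)` is evaluated first and returns the fitted estimator, which `Parallel` then tries to iterate over as if it were a collection of tasks. `Parallel` expects an iterable of `delayed(...)` calls. A minimal sketch of that calling convention, using a hypothetical `square` function that is not part of the original question:

```python
from joblib import Parallel, delayed


def square(x):
    """Hypothetical stand-in task, used only for illustration."""
    return x * x


# Parallel consumes an iterable of delayed(function)(args) tasks
# and returns their results as a list, in submission order.
results = Parallel(n_jobs=1)(delayed(square)(i) for i in range(5))
print(results)
```

With n_jobs=1 the tasks run in the calling process, so wrapping the fit this way is not expected to limit threads started inside a numerical library such as BLAS.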

Does anyone know how to set n_jobs=1 for sklearn's ElasticNet? There must be some way to do this because ElasticNetCV has n_jobs as a possible argument. Any help with this is greatly appreciated!

Might the accepted answer [here](https://stackoverflow.com/questions/65377950/how-do-i-restrict-the-number-of-processors-used-by-the-ridge-regression-model-in/) help? — amiola, Nov 17 '21 at 08:05
Thanks amiola! I can't figure out a way to set the global variable, since I am running Python on all machines and I don't want to force those jobs to run serially on just the one. — morepenguins, Nov 17 '21 at 16:18
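
A thread-limit environment variable does not have to be set machine-wide; it can be set for a single Python process. A minimal sketch, assuming the thread pools in question belong to OpenMP, MKL, or OpenBLAS (which of these applies depends on how NumPy and SciPy were built), and using a hypothetical helper named `limit_threads`:

```python
import os

# Assumed variable names: which one takes effect depends on the BLAS/OpenMP
# runtime that NumPy and SciPy are linked against.
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_threads(n_threads=1):
    """Set thread-limit variables for the current process only."""
    for var in THREAD_VARS:
        os.environ[var] = str(n_threads)


# Call this before numpy, scipy, or sklearn are first imported in this
# process, since these variables are typically read when the library loads.
limit_threads(1)
```

Other Python processes on the same machine keep their own environment and are not affected by this call.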

user1635327 · Answer 1 · 2021-11-19T18:49:45.553

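In addition to the first solution, you can limit MKL to a single thread from within Python (equivalent to setting MKL_NUM_THREADS=1):

# The mkl module is provided by the mkl-service package and only has an
# effect when NumPy/SciPy are linked against MKL.
import mkl
mkl.set_num_threads(1)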
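
If MKL is not the BLAS in use, or the limit should apply only around a single fit, the threadpoolctl package (a dependency of recent scikit-learn releases) offers a context manager for this. A minimal sketch, assuming threadpoolctl is installed; the synthetic data and the alpha and l1_ratio values are placeholders, not taken from the question:

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from threadpoolctl import threadpool_limits

# Placeholder data standing in for the question's predictors and response.
rng = np.random.default_rng(0)
predictors = rng.standard_normal((200, 20))
response = predictors @ rng.standard_normal(20) + 0.1 * rng.standard_normal(200)

model = ElasticNet(alpha=0.1, l1_ratio=0.5, fit_intercept=False, max_iter=10000)

# Limit the native thread pools (BLAS, OpenMP) that threadpoolctl detects
# to one thread for the duration of the fit.
with threadpool_limits(limits=1):
    model.fit(predictors, response)
```

When the with block exits, the previous thread limits are restored, so other code in the same process is unaffected.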

edited Nov 19 '21 at 18:49

answered Nov 19 '21 at 14:55

user1635327

