Memory leak using GridSearchCV

Question

Problem: I appear to have a memory leak when running GridSearchCV. It happens whether I run with 1 or 32 concurrent workers (n_jobs=-1). I have previously run this many times without trouble on Ubuntu 16.04, but I recently upgraded to 18.04 and upgraded the RAM.

import os
import pickle
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV,StratifiedKFold,train_test_split
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import make_scorer,log_loss
from horsebet import performance
# log_loss is a loss (lower is better), so it must not be maximized
scorer = make_scorer(log_loss, greater_is_better=False)
kfold = StratifiedKFold(n_splits=3)

# import and split data
with open(os.path.join('horsebet', 'data', 'x_normalized'), 'rb') as f:
    input_vectors = pickle.load(f)
with open(os.path.join('horsebet', 'data', 'y'), 'rb') as f:
    output_vector = pickle.load(f).ravel()
x_train, x_test, y_train, y_test = train_test_split(input_vectors, output_vector, test_size=0.2)


# XGB
model = XGBClassifier()
param = {
    'booster': ['gbtree'],
    'tree_method': ['hist'],
    'objective': ['binary:logistic'],
    'n_estimators': [100, 500],
    'min_child_weight': [0.8, 1],
    'gamma': [1, 3],
    'subsample': [0.1, 0.4, 1.0],
    'colsample_bytree': [1.0],
    'max_depth': [10, 20],
}

jobs = 8
model = GridSearchCV(model, param_grid=param, cv=kfold, scoring=scorer,
                     pre_dispatch=jobs * 2, n_jobs=jobs, verbose=5).fit(x_train, y_train)

Returns: UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.

OR

TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker. The exit codes of the workers are {SIGKILL(-9)}

My current hardware is a 16-core Threadripper with 32 GB of 3Mhz RAM. The data files total 100 MB. — negfrequency, Apr 25 '19 at 11:25

score 56 · Accepted Answer · answered Apr 28 '19 at 02:02

The cause of my issue was that I passed n_jobs=-1 to GridSearchCV, when it should have been passed to the classifier instead. Moving it there solved the issue.

negfrequency

I came across the same problem while I was doing GridSearch of xgboost in AWS SageMaker. Removing n_jobs=-1 in GridSearchCV solved the issue too. – CathyQian Aug 16 '19 at 17:57
where can I see the parameters of KerasClassifier? – Ben Oct 10 '19 at 12:28
Where in the classifier? I see no n_jobs argument... – Caterina Jul 20 '22 at 16:07

score 1 · Answer 2 · answered Oct 05 '22 at 13:33

Pass n_jobs to the classifier as a keyword argument, e.g. model = XGBClassifier(n_jobs=n_jobs), and remove the n_jobs argument from GridSearchCV.

Lévi Bernadine

score -3 · Answer 3 · answered Jan 13 '20 at 01:02

Although it is not exactly the same issue, I ran into the same error with skopt's gp_minimize() method. Even though the documentation says gp_minimize() supports n_jobs, it started failing on my Mac. When I moved n_jobs to the underlying XGBClassifier, it worked fine.

This did not work

gp_minimize(_minimize, param_space, n_calls=20, n_random_starts=3, random_state=2405)

This worked

xgb = xgboost.XGBClassifier(
    n_estimators=1000,  # use a large n_estimators deliberately to make use of early stopping
    objective='binary:logistic',
    n_jobs=-1
)
