When I run a grid search with XGBClassifier() or sklearn's GradientBoostingClassifier, I get an error roughly two minutes after starting.

  • Regarding memory: about 60% of memory is free while the search is running.
  • Logistic regression, LightGBM, CatBoost, and random forest all work correctly.

So the system kills the process, and I don't know why.

My code:

# XGB
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

xgb = XGBClassifier()
parameters = {'learning_rate': [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1],
              'subsample'    : [0.9, 0.1],
              'n_estimators' : [20, 200],
              'max_depth'    : [2, 30]
             }

grid_xgb = GridSearchCV(estimator=xgb, param_grid=parameters, cv=5, n_jobs=-2, scoring='roc_auc')
xgb_grid = grid_xgb.fit(X_train, y_train)
y_pred = xgb_grid.predict(X_test)
print('roc_auc train:', xgb_grid.best_score_)
# roc_auc_score expects (y_true, y_score), so y_test comes first
print('roc_auc test:', roc_auc_score(y_test, y_pred))
print("\nThe best parameters across ALL searched params:\n", grid_xgb.best_params_)
The error:

TerminatedWorkerError: A worker process managed by the executor was unexpectedly
terminated. This could be caused by a segmentation fault while calling the function
or by an excessive memory usage causing the Operating System to kill the worker.

Full traceback:

 exception calling callback for <Future at 0x1cf50891f90 state=finished raised TerminatedWorkerError>
    Traceback (most recent call last):
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\externals\loky\_base.py", line 26, in _invoke_callbacks
        callback(self)
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\parallel.py", line 385, in __call__
        self.parallel.dispatch_next()
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\parallel.py", line 834, in dispatch_next
        if not self.dispatch_one_batch(self._original_iterator):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\parallel.py", line 901, in dispatch_one_batch
        self._dispatch(tasks)
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\parallel.py", line 819, in _dispatch
        job = self._backend.apply_async(batch, callback=cb)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\_parallel_backends.py", line 556, in apply_async
        future = self._workers.submit(SafeFunction(func))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\externals\loky\reusable_executor.py", line 176, in submit
        return super().submit(fn, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\...\anaconda3\Lib\site-packages\joblib\externals\loky\process_executor.py", line 1129, in submit
        raise self._flags.broken
    joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
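
One way to see the real failure is to run the same search single-process, since TerminatedWorkerError from joblib's loky backend masks the worker's actual error. A minimal sketch, assuming the same `parameters`, `X_train`, and `y_train` as above:

# Run the identical search in-process: with n_jobs=1 there are no loky
# workers to be killed, so any crash or exception surfaces directly.
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

debug_grid = GridSearchCV(
    estimator=XGBClassifier(),
    param_grid=parameters,   # same grid as above
    cv=5,
    n_jobs=1,                # single process, no worker pool
    scoring='roc_auc',
    error_score='raise',     # re-raise fit errors instead of recording NaN
)
debug_grid.fit(X_train, y_train)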

I've already done:

These steps were done with a Windows reboot after each installation/reinstallation,

but the problem didn't disappear.

I use Windows 10, an Intel 12400, and 32 GB of RAM.

  • I think the error comes from `n_jobs=-2`, because your system doesn't have enough resources to support that many concurrent processes. Have you tried changing it? – GenZ Aug 18 '23 at 12:28
  • @GenZ, thank you, it works for the XGB classifier without a performance decrease, but not for GradientBoostingClassifier (sklearn). I tried n_jobs values of [-1, -2, 12, 10, 5]. – net_95 Aug 18 '23 at 15:02
  • Which error did you get? – GenZ Aug 19 '23 at 02:35
  • The same error, 'TerminatedWorkerError: ...' – net_95 Aug 21 '23 at 10:55
  • I'm not sure about that error, but `pre_dispatch` of `GridSearchCV` controls the number of jobs that get dispatched during parallel execution. You can also try it (see the sketch below). @net_95 – GenZ Aug 21 '23 at 11:54
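
Putting GenZ's two suggestions together, a sketch of a lower-pressure configuration (the specific values here are assumptions to tune, not a verified fix):

# Limit both layers of parallelism: each loky worker gets its own copy of
# the data, so fewer workers means lower peak memory, and pre_dispatch caps
# how many tasks are queued up at once.
grid_xgb = GridSearchCV(
    estimator=XGBClassifier(n_jobs=1),  # keep each worker's XGBoost single-threaded
    param_grid=parameters,
    cv=5,
    n_jobs=4,               # assumed value: a few workers instead of all cores minus one
    pre_dispatch='n_jobs',  # queue no more tasks than there are workers
    scoring='roc_auc',
)
grid_xgb.fit(X_train, y_train)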

0 Answers