
I am tuning the parameters of an XGBoost Regressor using a custom cross-validation method. One of the parameters I am tuning is the number of trees (n_estimators), and I am also using early_stopping_rounds so that training can stop early.

The problem is that I end up with a different model for each fold of the cross validation. For example, suppose I train with n_estimators=100 and early_stopping_rounds=20; in one fold training could run to completion without early stopping, while in the next fold it could stop at the 30th boosting round, effectively leaving a model with only 30 trees.
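
Here is a minimal sketch of the setup I mean (synthetic data and placeholder values; depending on the xgboost version, early_stopping_rounds is passed to the constructor or to fit()):

    from sklearn.datasets import make_regression
    from sklearn.model_selection import KFold
    from xgboost import XGBRegressor

    # placeholder data just to illustrate the per-fold behaviour
    X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)

    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
        # with xgboost >= 1.6 early_stopping_rounds can go in the constructor;
        # older versions take it as a fit() argument instead
        model = XGBRegressor(n_estimators=100, early_stopping_rounds=20, random_state=0)
        # the validation fold is used for early stopping, so each fold
        # can stop at a different boosting round
        model.fit(X[train_idx], y[train_idx],
                  eval_set=[(X[val_idx], y[val_idx])],
                  verbose=False)
        print(f"fold {fold}: best_iteration = {model.best_iteration}")
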

How should I proceed?

  • Don't use early stopping in your CV: https://stackoverflow.com/questions/48127550/early-stopping-with-keras-and-sklearn-gridsearchcv-cross-validation/48139341#48139341 – desertnaut Nov 11 '21 at 23:32
  • So what do you suggest? To do grid search with a huge n_estimators range? Eg [5, 10, 20, ... 100, 200, 300, 500, 1000]. It will be extremely slow. – rriccilopes May 13 '22 at 11:46
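
For reference, a rough sketch of what the suggestion in the comments seems to amount to: keep early stopping out of the cross validation and treat n_estimators as an ordinary grid parameter. The grid values below are only placeholders, not a recommendation:

    from sklearn.datasets import make_regression
    from sklearn.model_selection import GridSearchCV
    from xgboost import XGBRegressor

    # placeholder data, as in the sketch above
    X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)

    # n_estimators is tuned like any other hyperparameter; no early stopping inside the CV
    param_grid = {
        "n_estimators": [50, 100, 200, 500],
        "learning_rate": [0.05, 0.1],
    }
    search = GridSearchCV(
        XGBRegressor(random_state=0),
        param_grid,
        scoring="neg_mean_squared_error",
        cv=5,
    )
    search.fit(X, y)
    print(search.best_params_)
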

0 Answers