I'm having problems getting XGBoost to use all of my computer's cores for training and cross-validation.
Data:
import xgboost as xgb

data_dmatrix = xgb.DMatrix(data=X, label=y, nthread=-1)
dtrain = xgb.DMatrix(X_train, label=y_train, nthread=-1)
dtest = xgb.DMatrix(X_test, label=y_test, nthread=-1)
Model:
from xgboost import XGBRegressor

xg_model = XGBRegressor(objective='reg:linear', colsample_bytree=0.3, learning_rate=0.2,
                        max_depth=5, alpha=10, n_estimators=100, subsample=0.4,
                        booster='gbtree', n_jobs=-1)
and then, if I train the model with:
xgb.train(
    xg_model.get_xgb_params(),
    dtrain,
    num_boost_round=500,
    evals=[(dtest, "Test")],
    early_stopping_rounds=200)
it works, but it uses only one thread to run XGBoost. The processor sits at 25%; n_jobs=-1 is ignored.
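For reference, the parameter dict the wrapper hands to xgb.train can be printed directly; whether n_jobs shows up there (or is translated to nthread) depends on the XGBoost version, so this is just a sanity check:

# Sanity check: see which threading key, if any, the sklearn wrapper
# exports to the native API (n_jobs vs. nthread is version-dependent).
print(xg_model.get_xgb_params())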
But if I do cross-validation with the scikit-learn implementation:

from sklearn.model_selection import cross_val_score

scores = cross_val_score(xg_model, X, y, cv=kfold, n_jobs=-1)

then it uses all cores.
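(kfold here is just an ordinary k-fold splitter, roughly:)

from sklearn.model_selection import KFold

# The exact n_splits/random_state values are placeholders and don't matter here.
kfold = KFold(n_splits=5, shuffle=True, random_state=42)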
How can I force xgb.train and xgb.cv to use all cores?
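For completeness, the xgb.cv call I mean looks roughly like this (a sketch; nfold and the metric are illustrative, the params dict is the same one as above):

# Sketch of the cross-validation call; nfold and metrics are placeholders.
cv_results = xgb.cv(
    xg_model.get_xgb_params(),
    data_dmatrix,
    num_boost_round=500,
    nfold=5,
    metrics='rmse',
    early_stopping_rounds=200)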