I have trained the following XGBoost classifier:
model = XGBClassifier(
    objective='binary:logistic',
    base_score=0.5,
    booster='gbtree',
    colsample_bylevel=1,
    colsample_bynode=1,
    colsample_bytree=1,
    enable_categorical=False,
    gamma=2,
    gpu_id=-1,
    importance_type=None,
    interaction_constraints=[],
    learning_rate=0.09999999999999995,
    max_delta_step=0,
    # max_leaves=2,
    min_child_weight=0.9999999999999993,
    monotone_constraints='(1,1,-1,-1,-1)',
    n_estimators=40,
    n_jobs=1,
    nthread=1,
    num_parallel_tree=1,
    predictor='auto',
    random_state=0,
    reg_alpha=0.0009765625,
    reg_lambda=1,
    scale_pos_weight=1,
    silent=True,
    subsample=1,
    tree_method='exact',
    validate_parameters=1,
    pred_contribs=True,
    verbosity=None)
As you can see, I have commented out max_leaves=2 (which means that, by default, max_leaves=0). The model trains with no issues.
Then I tried to train the same XGBoost model, but this time with max_leaves=2 specified:
model = XGBClassifier(
    objective='binary:logistic',
    base_score=0.5,
    booster='gbtree',
    colsample_bylevel=1,
    colsample_bynode=1,
    colsample_bytree=1,
    enable_categorical=False,
    gamma=2,
    gpu_id=-1,
    importance_type=None,
    interaction_constraints=[],
    learning_rate=0.09999999999999995,
    max_delta_step=0,
    max_leaves=2,
    min_child_weight=0.9999999999999993,
    monotone_constraints='(1,1,-1,-1,-1)',
    n_estimators=40,
    n_jobs=1,
    nthread=1,
    num_parallel_tree=1,
    predictor='auto',
    random_state=0,
    reg_alpha=0.0009765625,
    reg_lambda=1,
    scale_pos_weight=1,
    silent=True,
    subsample=1,
    tree_method='exact',
    validate_parameters=1,
    pred_contribs=True,
    verbosity=None)
And the model does not train. In the Spyder console all I see is:
Restarting kernel...
If I try the same in Visual Studio, the process dies with an error such as Aborted!.
If I set max_leaves to a value between 1 and 11 (while the monotone_constraints are specified), then the model does not get trained. If the number of leaves is greater than 11, the model trains fine.
Question 1. Why is that?
Then I removed the monotone_constraints, set max_leaves to a value between 1 and 11, and the model gets trained.
Question 2. Why would the monotone_constraints interfere with max_leaves? I have tried this on different datasets (with both small and large numbers of records and features), and this is what I have consistently observed: if monotone_constraints is set, then max_leaves must be either 0 or greater than 11. I don't understand why. Has anyone come across this issue before?