I have trained the following XGBoost model:

model = XGBClassifier(
    objective='binary:logistic',
    base_score=0.5, 
    booster='gbtree', 
    colsample_bylevel=1,
    colsample_bynode=1, 
    colsample_bytree=1,
    enable_categorical=False, 
    gamma=2, 
    gpu_id=-1,
    importance_type=None, 
    interaction_constraints=[],
    learning_rate=0.09999999999999995, 
    max_delta_step=0,
    #max_leaves = 2,
    min_child_weight=0.9999999999999993, 
    monotone_constraints='(1,1,-1,-1,-1)',
    n_estimators=40, 
    n_jobs=1, 
    nthread=1, 
    num_parallel_tree=1,
    predictor='auto',
    random_state=0, 
    reg_alpha=0.0009765625, 
    reg_lambda=1,
    scale_pos_weight=1, 
    silent=True, 
    subsample=1,
    tree_method='exact',
    validate_parameters=1, 
    pred_contribs=True,  
    verbosity=None)

As you can see, max_leaves=2 is commented out, so max_leaves takes its default value of 0 (no limit). The model trains without issues.

Then I tried to train the same model, this time with max_leaves=2 specified:

model = XGBClassifier(
    objective='binary:logistic',
    base_score=0.5, 
    booster='gbtree', 
    colsample_bylevel=1,
    colsample_bynode=1, 
    colsample_bytree=1,
    enable_categorical=False, 
    gamma=2, 
    gpu_id=-1,
    importance_type=None, 
    interaction_constraints=[],
    learning_rate=0.09999999999999995, 
    max_delta_step=0,
    max_leaves = 2,
    min_child_weight=0.9999999999999993, 
    monotone_constraints='(1,1,-1,-1,-1)',
    n_estimators=40, 
    n_jobs=1, 
    nthread=1, 
    num_parallel_tree=1,
    predictor='auto',
    random_state=0, 
    reg_alpha=0.0009765625, 
    reg_lambda=1,
    scale_pos_weight=1, 
    silent=True, 
    subsample=1,
    tree_method='exact',
    validate_parameters=1, 
    pred_contribs=True,  
    verbosity=None)

This time the model does not train. In the Spyder console, all I see is:

Restarting kernel...

If I run the same code in Visual Studio, the process stops with the message Aborted!.

With monotone_constraints specified, the model fails to train for any max_leaves between 1 and 11. If max_leaves is greater than 11, the model trains fine.

Question 1. Why is that?

I then removed monotone_constraints and set max_leaves to values between 1 and 11, and the model trained in every case.

Question 2. Why would monotone_constraints interfere with max_leaves? I have tried this on different datasets (with small and large numbers of records and features), and the pattern is consistent: if monotone_constraints is set, then max_leaves must be either 0 or greater than 11. I don't understand why. Has anyone come across this issue before?
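
For anyone trying to reproduce this: because the crash takes the whole kernel down, it may help to run each configuration in a separate Python process and record the exit code and the last line of stderr. The following is only a sketch under assumptions. The synthetic data, the helper names run_isolated and sweep, and the reduced parameter set are mine, not the original setup, and it may not trigger the crash on every XGBoost version.

```python
import subprocess
import sys
import textwrap

# Training script executed in a child process. The synthetic data below is an
# assumption for illustration, not the dataset from the question.
TRAIN_TEMPLATE = textwrap.dedent("""
    import numpy as np
    from xgboost import XGBClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 5))
    y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)

    model = XGBClassifier(
        n_estimators=40,
        tree_method="exact",
        max_leaves={max_leaves},
        monotone_constraints={constraints!r},
    )
    model.fit(X, y)
    print("trained")
""")


def run_isolated(code, timeout=120):
    """Run a Python snippet in a fresh interpreter so a hard crash
    cannot take down the calling process."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def sweep(leaves_values=(0, 1, 2, 11, 12), constraints="(1,1,-1,-1,-1)"):
    """Train once per max_leaves value and collect the exit code plus the
    last stderr line (hypothetical helper, not part of XGBoost)."""
    results = {}
    for n in leaves_values:
        code = TRAIN_TEMPLATE.format(max_leaves=n, constraints=constraints)
        returncode, _, stderr = run_isolated(code)
        last_line = stderr.strip().splitlines()[-1:] or [""]
        results[n] = (returncode, last_line[0])
    return results
```

Calling sweep() and then sweep(constraints=None) would give one table with the constraints and one without. A non-zero exit code for a given max_leaves marks a failed run, and the stderr line may carry the message that Spyder hides behind "Restarting kernel...".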

Giampaolo Levorato