In general, I would not expect `sklearn.ensemble.GradientBoostingClassifier` and `xgboost.XGBClassifier` to agree, as they use very different implementations. But there are also conceptual differences between the quantities that you have tried to compare:
> And why is predict_proba different from scores?
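Probabilities (the output of `model.predict_proba(X)`) are obtained from the scores (the output of `model.decision_function(X)`) by applying the transformation that belongs to the loss/objective function. In the scikit-learn source, `predict_proba` hands the raw scores to the loss object, which carries out the actual transformation. For the default binary log-loss, that transformation is the logistic sigmoid.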
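To make the relation concrete, here is a minimal sketch for the scikit-learn side. It assumes a binary problem and the default log-loss, and the synthetic data from `make_classification` and the small `n_estimators` are my own choices for illustration. Under those assumptions, passing the raw scores through a sigmoid should reproduce the second column of `predict_proba`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic binary data (illustrative assumption, not from the question).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X, y)

# Raw scores: for the default binary log-loss these are log-odds.
scores = np.ravel(model.decision_function(X))

# Probability of class 1 as reported by the model.
proba = model.predict_proba(X)[:, 1]

# Applying the logistic sigmoid to the scores by hand.
sigmoid_of_scores = 1.0 / (1.0 + np.exp(-scores))

# True if the manual transformation matches predict_proba.
match = np.allclose(proba, sigmoid_of_scores)
print(match)
```

So the two outputs carry the same information on different scales, which is why their values differ.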
> I want the scores of the model. To plot ROC curves etc. How can I get the decision function for XGBoost classifier using the SKLearn wrapper?
For the ROC curve you will want to use `xgbmodel.predict_proba(X)[:,1]`, i.e. the second column, which corresponds to the class `1`.
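A rough sketch of why the probabilities are enough for ROC purposes: the ROC curve depends only on how the samples are ranked, and the sigmoid is monotonic, so scores and probabilities give the same curve. The example below uses scikit-learn's `GradientBoostingClassifier` as a stand-in so that it runs without xgboost installed. The data, the split and the variable names are assumptions for illustration. With an `XGBClassifier` you would pass `xgbmodel.predict_proba(X_test)[:, 1]` to `roc_curve` in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary data and a held-out test set (illustrative assumptions).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stand-in model; an XGBClassifier offers the same predict_proba method.
clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# Second column = predicted probability of class 1.
p1 = clf.predict_proba(X_test)[:, 1]

# Points of the ROC curve, ready for plotting fpr against tpr.
fpr, tpr, thresholds = roc_curve(y_test, p1)

# AUC from probabilities and from raw scores: the ranking is the same.
auc_from_proba = roc_auc_score(y_test, p1)
auc_from_scores = roc_auc_score(y_test, clf.decision_function(X_test))
print(auc_from_proba, auc_from_scores)
```

If you really do need the raw margins from xgboost, as far as I know the sklearn wrapper's `predict` accepts `output_margin=True`, but for plotting a ROC curve that should not be necessary.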