
I get different results for model.predict_proba(X)[:,0] compared to model.decision_function(X) for a regular gradient-boosted decision tree classifier in scikit-learn, so I know the two are not the same.

I want the scores of the model, e.g. to plot ROC curves. How can I get the decision function for an XGBoost classifier using the scikit-learn wrapper? And why is predict_proba different from the scores?

user7867665
  • Why should they be the same? Have you used the exact same algorithm with the exact same hyperparameters and random seed? – 00__00__00 Apr 12 '18 at 14:40
  • Yes, it's the same model trained once. I'm asking what the difference between the two functions is. – user7867665 Apr 13 '18 at 14:57
  • what is model.decision_function(X) ? Do you have an API/implementation doc for this? – Eran Moshe Apr 15 '18 at 10:57
  • It gives you the output of the classifier before a threshold is applied. It varies from algorithm to algorithm. For SVM it is the distance to the decision hyperplane (https://stackoverflow.com/questions/20113206/scikit-learn-svc-decision-function-and-predict) – user7867665 Apr 16 '18 at 15:20
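To illustrate the point made in the last comment: for a binary SVC in scikit-learn, predict is just decision_function thresholded at zero. A minimal sketch (dataset and hyperparameters are arbitrary choices, not from the original discussion):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy binary classification problem with classes 0 and 1
X, y = make_classification(n_samples=100, random_state=0)
clf = SVC(random_state=0).fit(X, y)

scores = clf.decision_function(X)  # signed distance to the separating hyperplane
preds = clf.predict(X)             # hard labels

# For binary SVC, predict is equivalent to thresholding the score at 0
assert ((scores > 0).astype(int) == preds).all()
```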

1 Answer


In general, I would not expect sklearn.ensemble.GradientBoostingClassifier and xgboost.XGBClassifier to agree, as they use very different implementations. But there is also a conceptual difference between the quantities that you have tried to compare:

And why is predict_proba different from scores?

Probabilities (the output of model.predict_proba(X)) are obtained from the scores (the output of model.decision_function(X)) by applying the loss/objective function's inverse link, see here for the call to the loss function and here for the actual transformation.
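For the default binomial deviance (log) loss in binary classification, that transformation is the logistic sigmoid, so the relationship can be checked directly. A sketch (dataset is an arbitrary toy example):

```python
import numpy as np
from scipy.special import expit  # logistic sigmoid: 1 / (1 + exp(-x))
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=200, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

raw = model.decision_function(X)      # raw scores (log-odds of class 1)
proba = model.predict_proba(X)[:, 1]  # probability of class 1

# The probabilities are the sigmoid of the raw scores
assert np.allclose(proba, expit(raw))
```

This also explains why predict_proba(X)[:, 0] looks nothing like decision_function(X): column 0 is 1 - expit(raw), the probability of class 0.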

I want the scores of the model. To plot ROC curves etc. How can I get the decision function for XGBoost classifier using the SKLearn wrapper?

For the ROC curve you will want to use xgbmodel.predict_proba(X)[:,1], i.e. the second column, which corresponds to class 1. Since the ROC curve only depends on the ranking of the samples, and the sigmoid is monotonic, probabilities and raw scores yield the same curve.
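A sketch of the ROC workflow; xgboost.XGBClassifier exposes the same scikit-learn-style predict_proba interface, but the example below uses GradientBoostingClassifier so it runs without xgboost installed (dataset and split are arbitrary choices):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Column 1 of predict_proba is the score for the positive class;
# roc_curve accepts probabilities or raw scores interchangeably
scores = model.predict_proba(X_te)[:, 1]
fpr, tpr, thresholds = roc_curve(y_te, scores)
auc = roc_auc_score(y_te, scores)
```

With an XGBClassifier you would swap in the fitted xgbmodel and pass xgbmodel.predict_proba(X_te)[:, 1] to roc_curve in exactly the same way.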

Mischa Lisovyi