Why are my CatBoost fit metrics different from the sklearn evaluation metrics?

Question

I'm not sure whether this question belongs on this forum or on Cross Validated, but I'll try here, since it's more about the output of the code than the technique per se. I'm running a CatBoost classifier like this:

# import libraries
import pandas as pd
from catboost import CatBoostClassifier
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# import data
train = pd.read_csv("train.csv")

# get features and label
X = train[["Pclass", "Sex", "SibSp", "Parch", "Fare"]]

y = train[["Survived"]]

# split into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# model parameters 
model_cb = CatBoostClassifier(
    cat_features=["Pclass", "Sex"],
    loss_function="Logloss",
    eval_metric="AUC",
    learning_rate=0.1,
    iterations=500,
    od_type="Iter",
    od_wait=200
)

# fit model
model_cb.fit(
    X_train,
    y_train,
    plot=True,
    eval_set=(X_test, y_test),
    verbose=50,
)

y_pred = model_cb.predict(X_test)

print(f1_score(y_test, y_pred, average="macro"))

print(roc_auc_score(y_test, y_pred))

The dataframe I'm using is from the Titanic competition (link).

The problem is that the model_cb.fit step reports an AUC of 0.87, but the last line, roc_auc_score from sklearn, reports an AUC of 0.73, i.e., much lower. As far as I understand, the AUC that CatBoost reports is already computed on the test dataset.

Any ideas on what the problem is and how I could fix it?

score 2 · Accepted Answer · answered Mar 25 '21 at 18:45

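The ROC curve needs predicted probabilities or some other sort of confidence measure, not hard class predictions. model_cb.predict returns hard class labels, so roc_auc_score only sees a single threshold and comes out lower than the AUC that CatBoost computes from its scores during fitting. Use probabilities instead: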

y_pred = model_cb.predict_proba(X_test)[:, 1]

See "Scikit-learn : roc_auc_score" and "Why does roc_curve return only 3 values?".
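To see why hard labels lower the AUC, here is a minimal sketch with made-up numbers (not the Titanic data). The `auc` helper is a hand-rolled illustration, not sklearn's implementation: it computes the AUC as the fraction of positive/negative pairs in which the positive is ranked higher, counting ties as half, which should agree with what roc_auc_score returns for binary labels. Thresholding the probabilities at 0.5 is assumed here to mimic what predict() does.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_proba = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

# Hard class labels at an assumed 0.5 threshold, analogous to predict() output.
y_hard = [1 if p >= 0.5 else 0 for p in y_proba]

auc_proba = auc(y_true, y_proba)  # uses the full ranking of the scores
auc_hard = auc(y_true, y_hard)    # ranking collapsed to two values, many ties

print(auc_proba, auc_hard)
```

With hard labels, every pair that lands on the same side of the threshold becomes a tie, so the ranking information that the AUC measures is mostly thrown away. That is the same effect as passing model_cb.predict(X_test) to roc_auc_score instead of model_cb.predict_proba(X_test)[:, 1].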

Ben Reiniger
