What is the correct way to calibrate probabilities when you have a multiclass problem?

Question

I am training a model to predict a label (target) based on loan status, with four classes: 0, 1, 2 and 3. So far I have trained a model as follows:

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from xgboost import XGBClassifier
from HyperclassifierSearch import HyperclassifierSearch

# data, numeric_cols, cat_cols and params are defined elsewhere (not shown)
X = data.iloc[:, :-1]
y = data.label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42)
# Create a hold-out dataset to train the calibrated model, to prevent overfitting
X_train, X_validation, y_train, y_validation = train_test_split(
    X_train, y_train, stratify=y_train, test_size=0.2, random_state=42)

categorical_transformer = OneHotEncoder(handle_unknown='ignore')
numeric_transformer = Pipeline(steps=[
    # strategy='constant' is required for fill_value to take effect;
    # with the default strategy ('mean') fill_value is not used
    ('imputer', SimpleImputer(missing_values=np.nan, strategy='constant', fill_value=0)),
    ('scaler', StandardScaler())])
preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_cols),
    ('cat', categorical_transformer, cat_cols)])

# Then I use the HyperclassifierSearch library
models = {
    'xgb': Pipeline(steps=[('preprocessor', preprocessor),
                           ('clf', XGBClassifier(objective='multi:softprob'))]),
    'rf': Pipeline(steps=[('preprocessor', preprocessor),
                          ('clf', RandomForestClassifier(criterion='entropy', random_state=42))]),
}

search = HyperclassifierSearch(models, params)
best_grid = search.train_model(X_train, y_train, cv=3, n_jobs=-1, scoring='accuracy')
results = search.evaluate_model()
fitted_model = best_grid.best_estimator_

pred = fitted_model.predict_proba(X_test)   # one column of probabilities per class
labels = fitted_model.predict(X_test)       # predicted class labels

**Note: I have left out the params dict and most of the other setup because they are large; the HyperclassifierSearch import is kept.**

My pred is a matrix with 4 columns, one for each loan class. I know it is generally good practice to calibrate the probabilities, particularly for tree-based algorithms, whose output is more of a score than a true probability. However, I am confused about how to calibrate these probabilities.
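To judge how far such a 4-column matrix is from calibrated, two common summary measures are the multiclass Brier score and the expected calibration error (ECE) of the predicted class. The following is a minimal sketch only: `multiclass_brier` and `top_label_ece` are hypothetical helper names, the data is random stand-in data rather than the real `pred` and `y_test`, and labels are assumed to be the integers 0..3.

```python
import numpy as np

def multiclass_brier(proba, y_true):
    """Mean squared distance between predicted rows and one-hot labels.

    Assumes y_true holds integer labels 0..K-1 matching the columns of proba.
    """
    onehot = np.eye(proba.shape[1])[y_true]
    return float(np.mean(np.sum((proba - onehot) ** 2, axis=1)))

def top_label_ece(proba, y_true, n_bins=10):
    """Expected calibration error of the predicted class, equal-width bins."""
    confidence = proba.max(axis=1)
    correct = (proba.argmax(axis=1) == y_true).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so bin ids fall in 0..n_bins-1
    bin_ids = np.digitize(confidence, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its share of samples
            ece += mask.mean() * abs(correct[mask].mean() - confidence[mask].mean())
    return float(ece)

# Random stand-in for pred / y_test (illustration only, not real model output)
rng = np.random.default_rng(0)
y_demo = rng.integers(0, 4, size=500)
pred_demo = rng.dirichlet(np.ones(4), size=500)
print(multiclass_brier(pred_demo, y_demo), top_label_ece(pred_demo, y_demo))
```

Lower is better for both measures. Computing them on the test set before and after calibration gives a rough indication of whether the calibration step helped.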

Usually I would calibrate using the holdout validation set, but I am unsure how to do that in the multiclass case.
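For illustration, one way the holdout approach could carry over to several classes is to fit one calibrator per class on the validation-set probabilities (class k versus the rest) and then renormalise each row so it sums to 1. The scikit-learn documentation describes CalibratedClassifierCV as handling multiclass input in this one-vs-rest-then-normalise fashion. The sketch below does it by hand with isotonic regression. It rests on assumptions: the data is synthetic, a plain RandomForestClassifier stands in for the fitted pipeline, and `fit_ovr_calibrators` and `apply_ovr_calibrators` are hypothetical helper names.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

# Synthetic 4-class stand-in for the loan data (assumption, not the real dataset)
X_demo, y_demo = make_classification(n_samples=2000, n_features=20, n_informative=8,
                                     n_classes=4, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, stratify=y_demo, test_size=0.2, random_state=42)
X_tr, X_val, y_tr, y_val = train_test_split(
    X_tr, y_tr, stratify=y_tr, test_size=0.2, random_state=42)

# The base model only ever sees the training split
base = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)

def fit_ovr_calibrators(proba_val, y_val, classes):
    """Fit one isotonic map per class: raw P(class k) -> observed frequency of k."""
    calibrators = []
    for k, cls in enumerate(classes):
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(proba_val[:, k], (y_val == cls).astype(float))
        calibrators.append(iso)
    return calibrators

def apply_ovr_calibrators(calibrators, proba):
    """Calibrate each column separately, then renormalise rows to sum to 1."""
    cal = np.column_stack([iso.predict(proba[:, k])
                           for k, iso in enumerate(calibrators)])
    row_sums = cal.sum(axis=1, keepdims=True)
    # Fall back to a uniform row if every calibrated column came out as 0
    uniform = np.full_like(cal, 1.0 / cal.shape[1])
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    return np.where(row_sums > 0, cal / safe_sums, uniform)

# Calibrators are fitted on the validation split and applied to the test split
calibrators = fit_ovr_calibrators(base.predict_proba(X_val), y_val, base.classes_)
pred_calibrated = apply_ovr_calibrators(calibrators, base.predict_proba(X_te))
```

Isotonic regression is only one choice here. With a small validation set, a sigmoid (Platt) map per class may overfit less.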

Update

Should I amend the XGBClassifier above as follows?

OneVsRestClassifier(CalibratedClassifierCV(XGBClassifier(objective='multi:softprob'), cv=10))

source: Multiclass linear SVM in python that return probability
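For comparison, as far as the scikit-learn documentation describes it, CalibratedClassifierCV accepts a multiclass estimator directly: it calibrates each class one-vs-rest internally and normalises the result. The extra OneVsRestClassifier wrapper may therefore not be needed. A minimal sketch on synthetic data, with RandomForestClassifier standing in for XGBClassifier (an assumption, to keep the example free of extra dependencies):

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic 4-class data (assumption, stands in for the loan data)
X_demo, y_demo = make_classification(n_samples=1500, n_features=20, n_informative=8,
                                     n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, stratify=y_demo, test_size=0.2, random_state=0)

# The estimator is passed positionally because its keyword name was
# renamed between scikit-learn releases.
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    method="sigmoid",  # Platt scaling; "isotonic" is the other built-in option
    cv=5,              # cross-validated calibration instead of a single holdout
)
calibrated.fit(X_tr, y_tr)

proba = calibrated.predict_proba(X_te)
print(proba.shape, proba.sum(axis=1)[:3])
```

With `cv=5` the calibrator is fitted on out-of-fold predictions, so no separate validation split is required. This is a different design from the single-holdout approach described in the question.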

My question is: what is the correct way to calibrate probabilities from a multiclass model?

"the correct way" seems to me to be a statistical question, which would belong better over at stats.SE or datascience.SE — Ben Reiniger, Sep 07 '21 at 16:34
https://stats.stackexchange.com/questions/543897/how-to-calibrate-with-multiclass-classification-problem — Maths12, Sep 07 '21 at 19:59
