
I have an imbalanced multiclass dataset. When I try to compute roc_auc_score, I get this error: ValueError: Number of classes in y_true not equal to the number of columns in 'y_score'.

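Here is the code:

from sklearn import svm
from sklearn.metrics import roc_auc_score

model = svm.SVC(kernel='linear', probability=True)
model.fit(X_train, y_train)
# One probability column per class seen during training
y_prob = model.predict_proba(X_test)
macro_roc_auc_ovr = roc_auc_score(y_test, y_prob, multi_class="ovr",
                              average="macro")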

Any suggestions for solving this problem?

Thank you

nesrine-bn

1 Answer


This problem happened to me when at least one class had no examples in one of the folds. To fix it, I replaced KFold with StratifiedKFold.
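A minimal sketch of that swap, assuming a synthetic imbalanced 3-class dataset from make_classification in place of the real data (the dataset, the fold count, and the names skf and fold_scores are illustrative assumptions, not taken from the question):

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Assumption: synthetic imbalanced 3-class data standing in for the real dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, weights=[0.8, 0.15, 0.05],
    flip_y=0, random_state=0,
)

# StratifiedKFold keeps the class proportions roughly equal in every fold,
# so each test fold contains at least one example of every class.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_scores = []
fold_class_counts = []
for train_idx, test_idx in skf.split(X, y):
    model = svm.SVC(kernel="linear", probability=True, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    y_prob = model.predict_proba(X[test_idx])

    # Number of distinct classes in this test fold; it must equal
    # the number of columns in y_prob for roc_auc_score to accept it.
    fold_class_counts.append(len(np.unique(y[test_idx])))
    fold_scores.append(
        roc_auc_score(y[test_idx], y_prob, multi_class="ovr", average="macro")
    )

print(fold_class_counts)
print(np.mean(fold_scores))
```

With a plain KFold on data this imbalanced, a fold can end up without the rarest class, which is what triggers the ValueError.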

I think it may be a problem with your split method. You can also ask for a stratified split by passing stratify=y to train_test_split (if that's what you're using).
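A hedged sketch of the stratified split, again assuming synthetic data from make_classification rather than the asker's dataset (the test_size value and the missing check are illustrative additions):

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic imbalanced 3-class data standing in for the real dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, weights=[0.8, 0.15, 0.05],
    flip_y=0, random_state=0,
)

# stratify=y preserves the class proportions in both train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = svm.SVC(kernel="linear", probability=True, random_state=0)
model.fit(X_train, y_train)
y_prob = model.predict_proba(X_test)

# Diagnostic: classes the model knows about that never occur in y_test.
# A non-empty result here is the situation that raises the ValueError.
missing = np.setdiff1d(model.classes_, np.unique(y_test))
print("classes missing from y_test:", missing)

macro_roc_auc_ovr = roc_auc_score(y_test, y_prob, multi_class="ovr",
                                  average="macro")
print(macro_roc_auc_ovr)
```

Running the same missing check on the original split is a quick way to confirm whether an absent class is the cause before changing anything else.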

    This is a patch for what is probably a bug. Imagine that we can't use StratifiedKFold (a time-series setting, for example): if you simply have imbalanced classes, there's no workaround... – An old man in the sea. Nov 19 '21 at 21:42