
I searched Google and saw a couple of StackOverflow posts about this error, but they don't match my case.

I use Keras to train a simple neural network and make some predictions on the held-out test dataset. But when I use roc_auc_score to calculate the AUC, I get the following error:

"ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.".

I inspected the target label distribution, and it is highly imbalanced. Some labels (out of 29 in total) have only one instance, so it's likely that some labels have no positive instance at all in the test split. That's why sklearn's roc_auc_score function reports the only-one-class problem, which is reasonable.
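For illustration, here is a minimal sketch (made-up values, not my real data) that reproduces the error directly:

import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.zeros(10, dtype=int)   # only the negative class is present
y_score = np.random.rand(10)       # arbitrary predicted scores

roc_auc_score(y_true, y_score)
# ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.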

But I'm curious: when I use sklearn's cross_val_score function, it handles the AUC calculation without error.

from sklearn import cross_validation

my_metric = 'roc_auc'
scores = cross_validation.cross_val_score(myestimator, data, labels,
                                          cv=5, scoring=my_metric)

I wonder what happens inside cross_val_score. Is it because cross_val_score uses a stratified cross-validation data split?
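For illustration only (using the newer model_selection API, which may not be exactly what cross_val_score does internally here), a stratified split keeps both classes in every test fold, while a plain sequential split may not:

import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

y = np.array([1] * 5 + [0] * 25)   # 5 positives out of 30 samples
X = np.arange(30).reshape(-1, 1)

for name, cv in [('KFold', KFold(n_splits=5)),
                 ('StratifiedKFold', StratifiedKFold(n_splits=5))]:
    # number of positive labels in each test fold
    print(name, [int(y[test].sum()) for _, test in cv.split(X, y)])
# KFold:           [5, 0, 0, 0, 0]  -> four test folds contain a single class
# StratifiedKFold: [1, 1, 1, 1, 1]  -> every test fold keeps both classes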


UPDATE
I continued digging, but still can't find the difference. I see that cross_val_score calls check_scoring(estimator, scoring=None, allow_none=False) to return a scorer, and check_scoring calls get_scorer(scoring), which returns scorer = SCORERS[scoring].

SCORERS['roc_auc'] is roc_auc_scorer, and roc_auc_scorer is made by

roc_auc_scorer = make_scorer(roc_auc_score, greater_is_better=True,
                             needs_threshold=True)

So it's still using the roc_auc_score function. I don't get why cross_val_score behaves differently from calling roc_auc_score directly.
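A quick sketch (assuming get_scorer is importable from sklearn.metrics) showing that the scorer really is just a thin wrapper around roc_auc_score, so it fails in exactly the same situation:

import numpy as np
from sklearn.metrics import get_scorer
from sklearn.linear_model import LogisticRegression

X = np.random.rand(20, 3)
y = np.array([0, 1] * 10)              # both classes present, so fitting works
clf = LogisticRegression().fit(X, y)

scorer = get_scorer('roc_auc')         # same object as SCORERS['roc_auc']
print(scorer(clf, X, y))               # fine: y contains both classes

try:
    scorer(clf, X, np.zeros(20, dtype=int))   # y_true with a single class
except ValueError as e:
    print(e)                           # same "Only one class present in y_true" error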

  • what is `my_metric`? – maxymoo Aug 19 '16 at 04:24
  • @maxymoo I use the string `roc_auc`, it is a valid value. – Allan Ruin Aug 19 '16 at 04:32
  • If you do cross validation and you have too few of one kind of label, some folds may be devoid of any such labels. Try decreasing the number of folds and make sure you use stratified sampling. – Kris Aug 19 '16 at 19:29
  • I have encountered precisely the same issue, and I think the problem is actually with the cross-validation split! In /usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py(131) cross_val_score(), after `cv = check_cv(cv, y, classifier=is_classifier(estimator))`, the line `cv_iter = list(cv.split(X, y, groups))` yields the following cv_iter: `np.count_nonzero(y[cv_iter[0][1],1])` is 2, `np.count_nonzero(y[cv_iter[1][1],1])` is 2, `np.count_nonzero(y[cv_iter[2][1],1])` is 0; no negative example for the 3rd split. – layser Mar 27 '17 at 14:09

1 Answer


I think your hunch is correct. The AUC (area under the ROC curve) needs a sufficient number of instances of both classes in order to make sense.

By default, cross_val_score calculates the performance metric on each fold separately. Another option would be to use cross_val_predict and compute the AUC over all folds combined.

You could do something like:

from sklearn.metrics import roc_auc_score
from sklearn.cross_validation import cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification


class ProbaEstimator(LogisticRegression):
    """
    This little hack needed, because `cross_val_predict`
    uses `estimator.predict(X)` internally.

    Replace `LogisticRegression` with whatever classifier you like.

    """
    def predict(self, X):
        return super(self.__class__, self).predict_proba(X)[:, 1]


# some example data
X, y = make_classification()

# define your estimator
estimator = ProbaEstimator()

# get predictions
pred = cross_val_predict(estimator, X, y, cv=5)

# compute AUC score
roc_auc_score(y, pred)
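As a side note, in newer scikit-learn versions (with the model_selection API) cross_val_predict accepts a method argument, so the subclass hack can probably be avoided; a sketch, reusing X, y and the imports above:

from sklearn.model_selection import cross_val_predict as cvp

# ask for probabilities directly instead of overriding predict()
proba = cvp(LogisticRegression(), X, y, cv=5, method='predict_proba')
roc_auc_score(y, proba[:, 1])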
  • Almost sure you have resolved your issue by now, but I faced the same problem recently and it turns out the reason is simple: cross_val_score by default uses _shuffling_ cross-validation, which has a theoretically nonzero chance of delivering enough instances of both classes for the AUC computation. I don't know which splitter you used, but it could be something like TimeSeriesSplit, which is _not_ shuffling by definition. Hence, for imbalanced datasets this kind of split _can_ produce only instances of the same class for some specific fold. – Vast Academician Aug 10 '19 at 21:22