I have a multi-class problem. I tried to calculate the ROC-AUC score using the function metrics.roc_auc_score() from sklearn. This function supports multi-class targets, but it needs the estimated class probabilities, so the classifier must have a predict_proba() method (which svm.LinearSVC() does not have).
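To illustrate the limitation, here is a minimal check, assuming a reasonably recent scikit-learn release, of which of the two estimators exposes predict_proba(). The variable names are only for illustration:

```python
from sklearn.svm import SVC, LinearSVC

# LinearSVC does not define a predict_proba method
linear_has_proba = hasattr(LinearSVC(), "predict_proba")

# SVC exposes predict_proba when it is created with probability=True
svc_has_proba = hasattr(SVC(kernel="linear", probability=True), "predict_proba")

print("LinearSVC has predict_proba:", linear_has_proba)
print("SVC(probability=True) has predict_proba:", svc_has_proba)
```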
Here is an example of what I am trying to do:
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
# Get the data
iris = datasets.load_iris()
X, y = iris.data, iris.target
# Create the model
clf = SVC(kernel='linear', probability=True)
# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
# Train the model
clf.fit(X_train, y_train)
# Predict the test data
predicted = clf.predict(X_test)
predicted_proba = clf.predict_proba(X_test)
roc_auc = roc_auc_score(y_test, predicted_proba, multi_class='ovr')
I tried to use svm.SVC() with a linear kernel and the probability parameter set to True. This lets me call the predict_proba() method on the fitted classifier. The problem is that it takes a long time to finish compared to LinearSVC() when the dataset is big (the example above is really quick because it has a small number of samples). Is there a way to use LinearSVC() and roc_auc_score() for a multi-class problem?