ValueError: continuous is not supported when using cross_val_score with LinearRegression

Question

I am using the load_boston dataset from sklearn with LinearRegression. The code:

from sklearn.datasets import load_boston
import pandas as pd
import numpy as np
%matplotlib inline
from sklearn.model_selection import train_test_split, KFold,cross_val_score,cross_validate
from sklearn.linear_model import LinearRegression

#Loading the dataset
x = load_boston()
df = pd.DataFrame(x.data, columns = x.feature_names)
df["MEDV"] = x.target
X = df.drop(columns="MEDV")   #Feature Matrix
y = df["MEDV"]          #Target Variable
df.head()

linear = LinearRegression()
X_train,X_test, y_train,y_test = train_test_split(X,y, random_state = 11)
linear.fit(X_train,y_train)

kfold = KFold(n_splits=5, random_state=11, shuffle=True)
scores = cross_val_score(estimator=linear, X=X, y=y, cv=kfold)  # works; adding scoring="accuracy" gives:

# ValueError: continuous is not supported

print(f"Mean Accuracy: {scores.mean():.2%} and standard deviation: {scores.std():.2%}")

If I use scoring="accuracy" in cross_val_score, it raises an error:

ValueError: continuous is not supported

What is happening?

Accuracy is a classification metric, and it is just meaningless in regression settings, hence the error; see similar situation here: https://stackoverflow.com/questions/38015181/accuracy-score-valueerror-cant-handle-mix-of-binary-and-continuous-target — desertnaut, Apr 21 '20 at 14:01
You are using `LinearRegression()` in your code. I'm not sure how you can say you are doing classification. — Mihai Chelaru, Apr 21 '20 at 14:07

yatu · Accepted Answer · 2020-04-21T14:26:21.370

Accuracy does not work here because it is a metric for classification problems. It is defined as:

Number of correct predictions / Total number of predictions
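As a rough illustration of that ratio (not part of the original answer), the sketch below computes accuracy by hand on made-up class labels, then shows that sklearn's `accuracy_score` rejects continuous targets. All label and price values are invented for the example.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Made-up class labels: 3 of the 4 predictions match.
y_true_cls = np.array([0, 1, 1, 0])
y_pred_cls = np.array([0, 1, 0, 0])

manual_accuracy = (y_true_cls == y_pred_cls).mean()  # correct / total
sk_accuracy = accuracy_score(y_true_cls, y_pred_cls)

# Made-up continuous targets, as in a regression problem.
y_true_reg = np.array([24.0, 21.6, 34.7])
y_pred_reg = np.array([25.1, 20.9, 33.2])

try:
    accuracy_score(y_true_reg, y_pred_reg)
    error_message = None
except ValueError as exc:
    # "Correct prediction" is undefined for real-valued targets.
    error_message = str(exc)

print(manual_accuracy, sk_accuracy)
print(error_message)
```

An exact match between a predicted and a true real number is almost never expected, which is why the ratio only makes sense for discrete labels.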

If you do not set scoring, it works fine, because cross_val_score then falls back to the estimator's own score method, which for LinearRegression returns the R^2 score (coefficient of determination). That is an appropriate metric to look at for a regression problem.
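A minimal sketch of that default behaviour follows. It assumes synthetic data from `make_regression` as a stand-in for the Boston housing data, so the numbers are not the asker's. It compares the default scoring with an explicit `scoring="r2"`, and also shows one other regression scorer, `"neg_mean_absolute_error"`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data (an assumption; stands in for the Boston housing data).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=11)

linear = LinearRegression()
kfold = KFold(n_splits=5, shuffle=True, random_state=11)

# No scoring argument: falls back to LinearRegression.score, i.e. R^2.
default_scores = cross_val_score(linear, X, y, cv=kfold)

# Explicit regression scorers.
r2_scores = cross_val_score(linear, X, y, cv=kfold, scoring="r2")
neg_mae_scores = cross_val_score(
    linear, X, y, cv=kfold, scoring="neg_mean_absolute_error"
)

print(f"Mean R^2: {default_scores.mean():.3f} (std {default_scores.std():.3f})")
print(f"Mean MAE: {-neg_mae_scores.mean():.3f}")
```

Error-based scorers in sklearn are negated (hence the `neg_` prefix) so that a higher score is always better; flip the sign to report the error itself.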

You can have a look at the different scoring types supported in sklearn and for what problems they are appropriate:

Metrics and scoring: quantifying the quality of predictions
