I get an error when I run the command confusionMatrix(Y_test, Y_pred, mode="everything"):
confusionMatrix(Y_test, Y_pred, mode="everything")
Error in confusionMatrix.default(Y_test, Y_pred, mode = "everything") :
  the data cannot have more levels than the reference
Here are the commands that I am using:
library(glmnet)  # cv.glmnet, predict.glmnet
library(caret)   # confusionMatrix
dat<-read.csv("C:/Salaries.csv")
dat$rank<-as.factor(dat$rank)
dat$discipline<-as.factor(dat$discipline)
dat$sex<-as.factor(dat$sex)
datOHE <- model.matrix(sex~.-1, dat)
train<-datOHE[1:317,]
test<-datOHE[318:nrow(datOHE),]
table(dat[1:317,]$sex)
table(dat[318:nrow(dat),]$sex)
X_train <- data.matrix(train)
Y_train <- data.matrix(dat[1:317,]$salary)
X_test <- data.matrix(test)
Y_test <- data.matrix(dat[318:nrow(dat), ]$salary)
set.seed(42)
cv.ridge <- cv.glmnet(X_train, Y_train, family='gaussian', alpha=0, type.measure='mse')
plot(cv.ridge)
Y_pred <- as.numeric(predict.glmnet(cv.ridge$glmnet.fit, newx=X_test, s=cv.ridge$lambda.min) > .5)
Y_pred <- as.factor(Y_pred)
Y_test <- as.factor(Y_test)
confusionMatrix(Y_test, Y_pred, mode="everything")
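
A quick way to see what triggers this message is to compare the factor levels on both sides before calling confusionMatrix. caret's signature is confusionMatrix(data, reference, ...), with predictions first and true labels second, and it stops with this error when the first argument has more levels than the second. In the code above Y_test is built from salary, so as a factor it probably has one level per distinct salary, while the thresholded Y_pred has at most two. The sketch below uses made-up toy vectors (y_true_raw, y_score and lv are hypothetical names, not from the code above) and base R only; it assumes the intended target is a 0/1 label.

```r
# Toy stand-ins for the vectors in the question (values are made up).
y_true_raw <- c(0, 1, 1, 0, 1)            # hypothetical 0/1 test labels
y_score    <- c(0.2, 0.7, 0.9, 0.6, 0.8)  # hypothetical numeric predictions

# Threshold the scores, then build both factors with the same explicit
# levels so neither side can end up with extra or missing levels.
lv     <- c("0", "1")
y_pred <- factor(as.integer(y_score > 0.5), levels = lv)
y_true <- factor(y_true_raw, levels = lv)

# Inspect the level counts; they should match before calling confusionMatrix.
print(nlevels(y_pred))
print(nlevels(y_true))

# Base-R cross-tabulation of predictions against the true labels.
tab <- table(Predicted = y_pred, Actual = y_true)
print(tab)

# With caret loaded, predictions go first and the reference second:
# confusionMatrix(y_pred, y_true, mode = "everything")
```

Running nlevels(Y_test) and nlevels(Y_pred) on the real objects should show whether the two factors disagree in the same way.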