I am trying to fit a ridge regression model on the white wine dataset. I want to use the entire dataset for training and 10-fold CV to estimate the test error rate. That's the main question: how to calculate the CV test error for a ridge-penalized logistic regression model. I calculated the best value of lambda (also using CV), and now I want to find the CV test error rate. Currently, my code for calculating the CV test error is -
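library(glmnet) # provides glmnet()
library(boot)   # provides cv.glm()
# misclassification cost: a prediction counts as wrong if it is more than 0.5 away from the label
cost1 <- function(good, pi = 0) mean(abs(good - pi) > 0.5)
ridge.model <- glmnet(x, y, alpha = 0, family = "binomial", lambda = bestlam)
ridge.model$beta # all coefficients for the variables
ridge.model.cv.err <- cv.glm(winedata, ridge.model, cost1, K = 10) # this call raises the error
ridge.model.cv.err$delta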
This gives the following error -
Error in cbind2(1, newx) %*% nbeta : not-yet-implemented method for %*%
Any ideas what could be causing this error?
It was suggested that I should use cv.glmnet instead. However, it doesn't seem to accept the model type (logistic here) as input, and it needs a sequence of lambda values, whereas I have only the single best lambda value obtained as described above. So running the code -
ridge.model.cv.err<- cv.glmnet(x,y, lambda = bestlam, cost1, K=10)
gives the error -
Error in cv.glmnet(x, y, lambda = bestlam, cost1, K = 10) : Need more than one value of lambda for cv.glmnet
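For reference, here is a minimal sketch of the quantity being asked for: the 10-fold CV misclassification rate at one fixed lambda, computed with a manual fold loop around glmnet. It is only an illustration under assumptions, not a confirmed solution. The synthetic x and y stand in for the wine predictors and the 'good' response, the value of bestlam is a placeholder, and the names folds, fold.err and cv.err are made up for the example.

```r
library(glmnet)

set.seed(1)
# Assumption: synthetic stand-ins for the wine predictor matrix and the 0/1 response
n <- 500
p <- 11
x <- matrix(rnorm(n * p), n, p)
y <- factor(rbinom(n, 1, plogis(x[, 1] - 0.5 * x[, 2])))
bestlam <- 0.05 # placeholder; the lambda chosen earlier by CV would go here

K <- 10
folds <- sample(rep(1:K, length.out = n)) # random assignment of rows to K folds
fold.err <- numeric(K)

for (k in 1:K) {
  test <- folds == k
  # fit the ridge-penalized logistic model on the other K - 1 folds
  fit <- glmnet(x[!test, ], y[!test], alpha = 0, family = "binomial", lambda = bestlam)
  # predicted class labels ("0" or "1") for the held-out fold
  pred <- predict(fit, newx = x[test, ], type = "class")
  # misclassification rate on the held-out fold
  fold.err[k] <- mean(as.vector(pred) != as.character(y[test]))
}

cv.err <- mean(fold.err) # average of the K fold error rates
cv.err
```

Because bestlam was itself chosen by CV on the same data, an error rate computed this way would likely be somewhat optimistic.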
The data were processed as -
winedata <- read.delim("winequality-white.csv", sep = ';')
# recode in one step: assigning "0" first would turn the column into character,
# so the later >= 7 test would become a string comparison
winedata$quality <- factor(ifelse(winedata$quality >= 7, "1", "0")) # "1" if quality >= 7, else "0"
names(winedata)[names(winedata) == "quality"] <- "good" #rename 'quality' to 'good'
Appreciate your help.