glmnet training throws error on x,y dataframe arguments: am I using it wrong?

Question

I'm trying to fit a penalized logistic regression with glmnet, predicting whether a car from the mtcars example data has an automatic or a manual transmission. I think my code is pretty straightforward, but I'm getting an error.

This first block simply splits mtcars into an 80% train set and a 20% test set:

library(glmnet)
attach(mtcars)

smp_size <- floor(0.8 * nrow(mtcars))

set.seed(123)
train_ind <- sample(seq_len(nrow(mtcars)), size=smp_size)

train <- mtcars[train_ind,]
test <- mtcars[-train_ind,]

I know the x data is supposed to be in matrix form without the response, so I split the training set into a non-response matrix (train_x) and a response vector (train_y):

train_x <- train[,!(names(train) %in% c("am"))]
train_y <- train$am

But when trying to train the model,

p1 <- glmnet(train_x, train_y)

I get the error:

Error in elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian,  :
  (list) object cannot be coerced to type 'double'

Am I missing something?

Don't `attach`. It's a bad habit that can get you into trouble, and you're not even using it in the code you share! — Gregor Thomas, Jun 08 '15 at 19:15
So....I'm not sure what you think you've done to actually convert your covariates to be a matrix. A matrix is a different data structure. You can't get one just by subsetting a data frame. — joran, Jun 08 '15 at 19:18
Have you looked at `cv.glmnet`? It does automatic k-fold CV. — Vlo, Jun 08 '15 at 19:19

score 5 · Answer 1 · answered Jun 08 '15 at 19:18

Coercing the first argument to a matrix solved it for me:

p1 <- glmnet(as.matrix(train_x), train_y)
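To see why the coercion matters, here is a small base-R sketch that rebuilds the split from the question (same seed and proportions) and inspects the objects. The name `x_mat` is only illustrative. Subsetting a data frame still gives a data frame, which is a list of columns, and that matches the "(list) object" in the error message.

```r
# Rebuild the question's split (same seed, same 80/20 proportions)
set.seed(123)
train_ind <- sample(seq_len(nrow(mtcars)), size = floor(0.8 * nrow(mtcars)))
train <- mtcars[train_ind, ]

# Dropping a column from a data frame still returns a data frame
train_x <- train[, !(names(train) %in% c("am"))]
class(train_x)       # "data.frame"
is.list(train_x)     # TRUE: a data frame is a list of columns
is.matrix(train_x)   # FALSE

# as.matrix() turns the all-numeric data frame into a numeric matrix
x_mat <- as.matrix(train_x)   # x_mat is a name chosen for this example
is.matrix(x_mat)     # TRUE
typeof(x_mat)        # "double"
dim(x_mat)           # 25 10
```

This works cleanly here because every column of mtcars is numeric. With factor or character columns, `as.matrix` would return a character matrix, so something like `model.matrix` would probably be needed instead.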

In fact, from ?glmnet, it looks like the first argument should be a matrix or a sparse matrix:

x: input matrix, of dimension nobs x nvars; each row is an observation vector. Can be in sparse matrix format (inherit from class "sparseMatrix" as in package Matrix; not yet available for family="cox")
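Since the question asks for a logistic regression, note that `glmnet` fits a Gaussian (linear) model unless told otherwise. A minimal sketch of the binomial fit, assuming the same split as in the question, might look like the following. The penalty value `s = 0.05` and the names `predictors`, `fit`, and `probs` are arbitrary choices for illustration, not recommendations; in practice the penalty would more likely be picked by cross-validation, for example with `cv.glmnet` as mentioned in the comments.

```r
library(glmnet)

# Same split as in the question
set.seed(123)
train_ind <- sample(seq_len(nrow(mtcars)), size = floor(0.8 * nrow(mtcars)))
train <- mtcars[train_ind, ]
test  <- mtcars[-train_ind, ]

# Predictor columns: everything except the response "am"
predictors <- setdiff(names(mtcars), "am")
train_x <- as.matrix(train[, predictors])
test_x  <- as.matrix(test[, predictors])

# family = "binomial" requests penalized logistic regression;
# the default family ("gaussian") would treat the 0/1 response as continuous
fit <- glmnet(train_x, train$am, family = "binomial")

# Predicted probabilities that am == 1 for the held-out cars.
# s = 0.05 is an arbitrary illustrative penalty value.
probs <- predict(fit, newx = test_x, s = 0.05, type = "response")
```

With a data set this small, glmnet may emit warnings about class sizes or convergence; those are warnings about the data, not about the matrix coercion.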
