How to get the intercept from a linear model with lasso (lars R package)

Question

I am having a hard time getting the model estimated by the R package lars for my data.

For example I create a fake dataset x and corresponding values y like this:

library(lars)

x = cbind(runif(100),rnorm(100))
colnames(x) = c("a","b")
y = 0.5 + 3 * x[,1,drop = FALSE]

Next I train a model that uses lasso regularization using the lars function:

m = lars(x,y,type = "lasso", normalize = FALSE, intercept = TRUE)

Now I would like to recover the estimated model (which I know to be: y = 0.5 + 3 * x[,1] + 0 * x[,2]).

I am only interested in the coefficients obtained in the last step:

cf = predict(m, x, s=1, mode = "fraction", type = "coef")$coef
cf
a b 
3 0

These are the coefficients that I expect, but I can't find a way to get the intercept (0.5) from m.

I have tried to check the code of predict.lars, where the fit is done as such:

fit = drop(scale(newx,
           object$meanx, FALSE) %*% t(newbetas)) + object$mu

I can see that the variables are scaled, and that the mean of y (object$mu) is used, but I can't find an easy way to obtain the value of the intercept I am looking for. How can I get that?

Hi, you can replace `x` with `cbind(1,x)` to add a column of ones and use the option `intercept=FALSE`. — Stéphane Laurent, Aug 14 '14 at 16:06
... but it is not a good idea because lasso could set the intercept at 0 — Stéphane Laurent, Aug 17 '14 at 16:50

score 7 · Accepted Answer · answered Jan 30 '14 at 21:23

intercept=T in lars has the effect of centering the x variables and y variable. It doesn't include an explicit intercept term with a coefficient.

That being said, you could do predict(m, data.frame(a=0, b=0), s=2)$fit to get the predicted value of y when the covariates are 0 (the definition of a traditional intercept).
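
A runnable version of that idea might look like the sketch below. It assumes the lars package is installed, rebuilds the question's fake data with an arbitrary seed, and uses s = 1 with mode = "fraction" (as in the question) to pick the end of the path. The names zero and intercept_at_zero are only illustrative.

```r
# Sketch: read the intercept off a lars fit by predicting at x = 0.
# Assumes the lars package is installed; the seed is arbitrary.
if (requireNamespace("lars", quietly = TRUE)) {
  library(lars)

  set.seed(1)
  x <- cbind(a = runif(100), b = rnorm(100))
  y <- 0.5 + 3 * x[, "a"]

  m <- lars(x, y, type = "lasso", normalize = FALSE, intercept = TRUE)

  # One row of zeros with the same column names as the training matrix
  zero <- matrix(0, nrow = 1, ncol = ncol(x),
                 dimnames = list(NULL, colnames(x)))

  # Fitted value at the origin, taken at the end of the lasso path
  intercept_at_zero <- predict(m, zero, s = 1, mode = "fraction",
                               type = "fit")$fit
  print(intercept_at_zero)  # should be close to 0.5 for this noise-free data
}
```

Passing a matrix with matching column names keeps newx in the same shape as the training data.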

Jeremy Coyle

Thanks, I was looking for a way to access it from the data structure, but that doesn't seem to be possible. Another way I have found uses the fact that the fitted model is (y - ym) = b1*(x1 - x1m) + b2*(x2 - x2m), so the intercept in terms of the un-centred variables is ym - b1*x1m - b2*x2m, where the m suffix denotes the mean of the variable. – lucacerone Jan 30 '14 at 21:48
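
The centring argument in this comment can be sketched in R as follows. The helper uncentred_intercept is a hypothetical name, not part of lars. The first half is a base-R check on data shaped like the question's. The second half assumes lars is installed and relies on the mu and meanx components that the predict.lars snippet in the question reads from the fitted object.

```r
# Hypothetical helper: intercept on the original (un-centred) scale,
# given the mean of y, the column means of x, and the slope coefficients.
uncentred_intercept <- function(y_mean, x_means, coefs) {
  y_mean - sum(x_means * coefs)
}

# Base-R check on data shaped like the question's (seed is arbitrary)
set.seed(1)
x <- cbind(a = runif(100), b = rnorm(100))
y <- 0.5 + 3 * x[, "a"]
b0 <- uncentred_intercept(mean(y), colMeans(x), c(a = 3, b = 0))
print(b0)  # 0.5 up to floating-point error

# With a lars fit, the same means are stored on the object as mu and meanx
if (requireNamespace("lars", quietly = TRUE)) {
  library(lars)
  m <- lars(x, y, type = "lasso", normalize = FALSE, intercept = TRUE)
  cf <- predict(m, x, s = 1, mode = "fraction", type = "coef")$coefficients
  print(uncentred_intercept(m$mu, m$meanx, cf))
}
```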
