How should I get the coefficients of Lasso Model?

Question

Here is my code:

library(MASS)
library(caret)
df <- Boston
set.seed(3721)
cv.10.folds <- createFolds(df$medv, k = 10)
lasso_grid <- expand.grid(fraction=c(1,0.1,0.01,0.001))
lasso <- train(medv ~ ., 
               data = df, 
               preProcess = c("center", "scale"),
               method ='lasso',
               tuneGrid = lasso_grid,
               trControl= trainControl(method = "cv", 
                                       number = 10, 
                                       index = cv.10.folds))  

lasso

Unlike with a linear model, I cannot find the coefficients of the lasso regression model in summary(lasso). How can I extract them? Or should I use glmnet instead?

You can extract `finalModel` out of the object, upon which you can then use `elasticnet` functions, e.g. `predict(lasso$finalModel, type = 'coefficients')`, or just dig deeper into the object: `lasso$finalModel$beta.pure`. There may be a way more in keeping with the caret API, but I'm not sure what it would be. — alistaire, Jan 23 '17 at 01:59
I think it's performing stepwise selection by varying the shrinkage parameter. There are also Cp values in there you could use to pick the best model. I'm not an elastic net expert, though, so I'm sure if you ask that question at [CrossValidated](http://stats.stackexchange.com/) you'll get a better answer. — alistaire, Jan 23 '17 at 03:26
@KAICHENGWANG I'm having the same issue. I have also posted a similar question here: https://stackoverflow.com/questions/48079660/extract-the-coefficients-for-the-best-tuning-parameters-in-caret. Did you ever find out how to extract the coefficients corresponding to the best tuning parameters? — pd441, Jan 05 '18 at 11:14

score 2 · Answer 1 · answered Jun 14 '20 at 21:14

When you train with method="lasso", enet from elasticnet is called:

lasso$finalModel$call
elasticnet::enet(x = as.matrix(x), y = y, lambda = 0)

The vignette says:

The LARS-EN algorithm computes the complete elastic net solution simultaneously for ALL values of the shrinkage parameter in the same computational cost as a least squares fit

lasso$finalModel$beta.pure holds 16 sets of coefficients, one for each of the 16 L1 norm values stored in lasso$finalModel$L1norm:

length(lasso$finalModel$L1norm)
[1] 16

dim(lasso$finalModel$beta.pure)
[1] 16 13
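The structure of that path can be inspected without caret. The following is a minimal sketch, assuming the elasticnet package is installed and that standardising the predictors by hand is a fair stand-in for caret's preProcess step; the names x, y, fit, path, frac and path_summary are illustrative only and not part of the original code.

```r
library(MASS)        # Boston data
library(elasticnet)  # enet()

# Standardise the predictors by hand (assumption: this mirrors
# preProcess = c("center", "scale") from the question)
x <- scale(as.matrix(Boston[, setdiff(names(Boston), "medv")]))
y <- Boston$medv

# lambda = 0 switches off the ridge penalty, leaving a pure lasso path
fit <- enet(x = x, y = y, lambda = 0)

path <- fit$beta.pure        # one row per step, one column per predictor
l1   <- rowSums(abs(path))   # L1 norm of the coefficients at each step
frac <- l1 / max(l1)         # share of the largest L1 norm on the path

# Per-step overview: fraction reached and number of non-zero coefficients
path_summary <- data.frame(step     = seq_len(nrow(path)),
                           fraction = round(frac, 3),
                           nonzero  = rowSums(path != 0))
print(path_summary)
```

Each row of path_summary is one candidate model; the fraction column is computed here from the raw coefficients, so it should only be read as comparable to caret's tuning parameter when all predictors are on the same scale.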

You can see it using predict too:

predict(lasso$finalModel,type="coef")
$s
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16

$fraction
 [1] 0.00000000 0.06666667 0.13333333 0.20000000 0.26666667 0.33333333
 [7] 0.40000000 0.46666667 0.53333333 0.60000000 0.66666667 0.73333333
[13] 0.80000000 0.86666667 0.93333333 1.00000000

$mode
[1] "step"

$coefficients
          crim        zn       indus      chas        nox       rm        age
0   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 0.000000 0.00000000
1   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 0.000000 0.00000000
2   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 1.677765 0.00000000
3   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 2.571071 0.00000000
4   0.00000000 0.0000000  0.00000000 0.0000000  0.0000000 2.716138 0.00000000
5   0.00000000 0.0000000  0.00000000 0.2586083  0.0000000 2.885615 0.00000000
6  -0.05232643 0.0000000  0.00000000 0.3543411  0.0000000 2.953605 0.00000000
7  -0.13286554 0.0000000  0.00000000 0.4095229  0.0000000 2.984026 0.00000000
8  -0.21665925 0.0000000  0.00000000 0.5196189 -0.5933941 3.003512 0.00000000
9  -0.32168140 0.3326103  0.00000000 0.6044308 -1.0246080 2.973693 0.00000000
10 -0.33568474 0.3771889 -0.02165730 0.6165190 -1.0728128 2.967696 0.00000000
11 -0.42820289 0.4522827 -0.09212253 0.6407298 -1.2474934 2.932427 0.00000000
12 -0.62605363 0.7005114  0.00000000 0.6574277 -1.5655601 2.832726 0.00000000
13 -0.88747102 1.0150162  0.00000000 0.6856705 -1.9476465 2.694820 0.00000000
14 -0.91679342 1.0613165  0.09956489 0.6837833 -2.0217269 2.684401 0.00000000
15 -0.92906457 1.0826390  0.14103943 0.6824144 -2.0587536 2.676877 0.01948534

The hyperparameter tuned by caret is the fraction of the maximum L1 norm, so in the result you provided the selected value is 1, i.e. the maximum:

lasso
The lasso 

506 samples
 13 predictor

Pre-processing: centered (13), scaled (13) 
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 51, 51, 51, 50, 51, 50, ... 
Resampling results across tuning parameters:

  fraction  RMSE      Rsquared   MAE     
  0.001     9.182599  0.5075081  6.646013
  0.010     9.022117  0.5075081  6.520153
  0.100     7.597607  0.5572499  5.402851
  1.000     6.158513  0.6033310  4.140362

RMSE was used to select the optimal model using the smallest value.
The final value used for the model was fraction = 1.

To get the coefficients for the optimal fraction (step 16 is the last step on the path, which corresponds to fraction = 1):

predict(lasso$finalModel,type="coef",s=16)
$s
[1] 16

$fraction
[1] 1

$mode
[1] "step"

$coefficients
       crim          zn       indus        chas         nox          rm 
-0.92906457  1.08263896  0.14103943  0.68241438 -2.05875361  2.67687661 
        age         dis         rad         tax     ptratio       black 
 0.01948534 -3.10711605  2.66485220 -2.07883689 -2.06264585  0.85010886 
      lstat 
-3.74733185
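Hard-coding s = 16 only works because the selected fraction is 1. As a rough sketch, assuming an enet fit built directly on hand-standardised Boston data (the names fit, last_step, by_step and by_frac are illustrative), the last step can be looked up from the object and compared with a request for fraction = 1:

```r
library(MASS)        # Boston data
library(elasticnet)  # enet(), predict method for enet objects

x <- scale(as.matrix(Boston[, setdiff(names(Boston), "medv")]))
y <- Boston$medv
fit <- enet(x = x, y = y, lambda = 0)

# The last row of beta.pure is the end of the path
last_step <- nrow(fit$beta.pure)

# Coefficients at the last step, addressed by step number
by_step <- predict(fit, type = "coefficients",
                   mode = "step", s = last_step)$coefficients

# The same point on the path, addressed by fraction of the maximum L1 norm
by_frac <- predict(fit, type = "coefficients",
                   mode = "fraction", s = 1)$coefficients

# Both requests are expected to describe the same model
all.equal(by_step, by_frac)
```

For any selected fraction below 1 there is generally no step number that matches exactly, so addressing the path by fraction is the safer route.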

score 0 · Answer 2 · answered Sep 11 '20 at 12:01

I noticed that the approach above can cause problems if you define your own grid for hyperparameter tuning. predict.enet appears to impose its own grid, which often does not correspond to the grid defined for train().

If this is the case, you can set the "mode" argument to "fraction" and pass the fractions from the train() output to the "s" argument:

predict(lasso$finalModel, type = "coef", mode = "fraction", s = lasso$bestTune)

The "s" argument can also be just your optimal tuning parameter, as determined by train():

predict(lasso$finalModel, type = "coef", mode = "fraction", s = as.numeric(lasso$bestTune))

Created on 2020-09-11 by the reprex package (v0.3.0)
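Putting the pieces together, here is a hedged end-to-end sketch. It assumes caret and elasticnet are installed and uses plain 10-fold cross-validation instead of the custom index from the question, so the resampling numbers will differ from those shown above. The names best, coefs and all_coefs are illustrative only.

```r
library(MASS)        # Boston data
library(caret)       # train()
library(elasticnet)  # backend for method = "lasso"

set.seed(3721)
lasso <- train(medv ~ ., data = Boston,
               preProcess = c("center", "scale"),
               method = "lasso",
               tuneGrid = expand.grid(fraction = c(1, 0.1, 0.01, 0.001)),
               trControl = trainControl(method = "cv", number = 10))

# Fraction selected by cross-validation
best <- lasso$bestTune$fraction

# Coefficients at the selected fraction, as a named numeric vector
coefs <- predict(lasso$finalModel, type = "coefficients",
                 mode = "fraction", s = best)$coefficients
coefs[coefs != 0]   # predictors the lasso kept

# Coefficients for every fraction in the tuning grid, one row per fraction
all_coefs <- predict(lasso$finalModel, type = "coefficients",
                     mode = "fraction", s = lasso$results$fraction)$coefficients
rownames(all_coefs) <- lasso$results$fraction
```

Because preProcess centers and scales the predictors before the model is fitted, these coefficients refer to the standardised predictors, not to the original units.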
