
I am trying to bootstrap the lambda1 parameter in a LASSO regression (using the penalized package). I am NOT bootstrapping the coefficient estimates, as I know that it does not make sense to calculate e.g. 95% CIs for them; this question is about lambda1 ONLY. This is where I am so far:

df <- read.table(header=T, text="group class v1 v2 
1          Ala         1          3.98         23.2  
2          Ala         2          5.37         18.5  
3          C         1          4.73         22.1  
4          B         1          4.17         22.3  
5          C         2          4.47         22.4  
") 

Tried this:

X<-df[,c(3,4)] # data, variables in columns, cases in rows
Y<-df[,2] # dichotomous response
for (i 1:100) {
opt1<-optL1(Y,X)
opt1$lambda
}

But got Error: unexpected "}" in "}"

Tried this:

f<-function(X,Y,i){
opt1<-optL1(Y,X,[i])
}
boot(X,f,100)

But got Error in boot (X,f,100): incorrect number of subscripts on matrix... Can somebody help?

StupidWolf
AussieAndy
  • What specifically do you mean by "it doesn't work"? What actually happens when you run the code? Also, can you provide a sample of data intended to work with your code? See [how to make a reproducible example](http://stackoverflow.com/a/5963610/496488). – eipi10 Feb 05 '16 at 23:57
  • It's hard to answer without any data. But there is no i inside the loop, so you're overwriting opt1$lambda every time. Maybe add x[i,] and store the output (opt1$lambda) in a vector, data frame, or matrix. – Wave Feb 06 '16 at 00:06
  • Yes, I have tried x[i,] but ended up with Error in '[.data.frame'(X,i,) : object 'i' not found :( – AussieAndy Feb 06 '16 at 00:25

2 Answers


Here is what is wrong with the for loop:

1) It needs the syntax for (i in 1:100) {} in order to work;

2) It needs to save opt1$lambda in a proper object;

3) It most likely needs the values (Y,X) to change from one iteration of the loop to another.

The R code which addresses items 1) and 2) above could be written as follows:

lambda <- NULL 
for (i in 1:100) {

    opt1 <- optL1(Y,X)  # opt1 will NOT change
                  # since Y and X are the SAME
                  # over each iteration of the for loop
    lambda <- c(lambda, opt1$lambda)

}

lambda

In this code, the object lambda, which stores the value of opt1$lambda produced at each iteration, is initialized before the for loop with the command lambda <- NULL and is then extended after each iteration with the command lambda <- c(lambda, opt1$lambda).
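Item 3) is what turns the loop into a bootstrap: each iteration should refit on rows resampled with replacement, otherwise every iteration sees the same Y and X. A minimal sketch of that resampling step follows. To keep it runnable without the penalized package, it assumes a hypothetical helper `fit_lambda()` that stands in for `optL1(Y, X)$lambda`, and the toy data are made up for illustration.

```r
set.seed(1)                        # reproducible resampling

# Toy data, made up for illustration (not the question's data)
n <- 50
X <- data.frame(v1 = rnorm(n), v2 = rnorm(n))
Y <- rbinom(n, 1, 0.5)

# Hypothetical stand-in for the tuning step. With penalized loaded,
# the body would instead be: optL1(Y, X)$lambda
fit_lambda <- function(Y, X) {
  max(abs(cor(X, Y)))              # placeholder statistic, NOT a LASSO penalty
}

B <- 100
lambda <- numeric(B)               # pre-allocated result vector
for (i in seq_len(B)) {
  idx <- sample(n, replace = TRUE) # bootstrap row indices for this iteration
  lambda[i] <- fit_lambda(Y[idx], X[idx, , drop = FALSE])
}

summary(lambda)                    # spread of the statistic across resamples
```

The same `idx` is applied to both Y and X so that each resampled response stays paired with its own covariates.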

In general, using the NULL trick is not recommended for a large number of iterations, because the vector is copied every time it grows. A better alternative would be this:

lambda <- vector('list', 100)  # pre-allocate a list with 100 empty components
for (i in 1:100) {

  opt1 <- optL1(Y,X)  # opt1 will NOT change
                  # since Y and X are the SAME
                  # over each iteration of the for loop
  lambda[i] <- opt1$lambda

}

lambda <- unlist(lambda) 

lambda

With this second alternative, we pre-allocate lambda before the for loop as a list with 100 components, such that the i-th component will store the value of opt1$lambda produced during the i-th iteration. Inside the for loop, we save the value of opt1$lambda in the list named lambda with the command:

lambda[i] <- opt1$lambda

At the end of the loop, we unlist lambda so that it becomes a regular numeric vector.
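Since each iteration yields a single number, the result could also be pre-allocated as a numeric vector, which makes the final unlist() unnecessary. A small base-R sketch comparing the two routes, with a made-up value standing in for opt1$lambda:

```r
n_iter <- 100

as_list <- vector("list", n_iter)  # list with n_iter empty (NULL) slots
as_num  <- numeric(n_iter)         # numeric vector of n_iter zeros

for (i in seq_len(n_iter)) {
  value <- i / 10                  # made-up stand-in for opt1$lambda
  as_list[[i]] <- value            # [[i]] sets the i-th list element
  as_num[i]    <- value            # plain indexing for an atomic vector
}

# Both routes end up with the same numeric vector
identical(unlist(as_list), as_num)
```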

Isabella Ghement

You can alter the function to take in a data.frame, and specify the columns to use for the response and covariates inside optL1:

library(boot)
library(penalized)

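f <- function(data, ind) {
  # boot() supplies 'ind', the row indices of one bootstrap resample
  fit <- optL1(data[ind, "class"], data[ind, c("v1", "v2")])
  fit$lambda  # return the lambda1 selected for this resample
}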

df = data.frame(group=sample(c("A","B","C"),100,replace=TRUE),
class=sample(2,100,replace=TRUE),
v1 = rnorm(100),
v2 = rnorm(100)
)

bo = boot(df,f,100)

bo

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = df, statistic = f, R = 100)


Bootstrap Statistics :
    original    bias    std. error
t1* 2.887399 0.2768409     1.85466
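
The printed summary shows the statistic on the original data together with the bootstrap bias and standard error. The individual replicates are stored in the boot object, so a percentile interval can be read off from them. A short sketch of that step follows; so that it runs without penalized, it assumes a placeholder statistic (`stat`, made up here) in the role that f() plays above, and uses made-up data.

```r
library(boot)

set.seed(42)
df <- data.frame(v1 = rnorm(100), v2 = rnorm(100))  # made-up data

# Placeholder statistic so the block runs without penalized;
# in the answer above this role is played by f(), which returns fit$lambda
stat <- function(data, ind) sd(data[ind, "v1"])

bo <- boot(df, stat, R = 200)

bo$t0                             # statistic computed on the original data
head(bo$t)                        # matrix with one row per bootstrap replicate
quantile(bo$t, c(0.025, 0.975))   # simple percentile interval
boot.ci(bo, type = "perc")        # percentile interval computed by boot
```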
StupidWolf