Instead of specifying the predictors as regression arguments, I would like to pass a string and convert it into the right syntax before it is used for penalized regression. The questions "Loop function to add large numbers of predictors in regression function" and "How to use paste to get formula" explain how this can be done for lm, but the same approach does not work for penalized regression.

Here is my code:

df<-data.frame(date=seq(as.Date("2018-01-01"), as.Date("2018-10-01"), by="days"))
df$month<-format(as.Date(df$date), "%m")
df$y<-runif(nrow(df),1,100)
df$time<- -floor(nrow(df)/2):(ceiling(nrow(df)/2)-1)/1000
df$month<-as.factor(df$month)
training<-df  # assumption: `training` is used below but never defined; here it is simply all of df

yname<-"y"
xnames<-colnames(training)
xnames<-xnames[-which(xnames==yname)]
xnames<-xnames[-which(xnames=="date")]
yname<-paste(yname,",")
formula<-paste(yname,"~",paste(xnames,collapse="+"))

ens<-penalized(formula, ~ 0,lambda1=1, lambda2=1, positive =TRUE, data=training)

I tried using as.formula for the formula, but it does not work with the comma. Everything works if I type the variable names in manually, and the string approach works for lm but not for penalized. Any ideas?

Please note that I have edited the question to make it more specific for penalized.

Mac
    Possible duplicate of [Pass a vector of variables into lm() formula](https://stackoverflow.com/questions/9238038/pass-a-vector-of-variables-into-lm-formula) – Ritchie Sacramento Oct 02 '19 at 11:42

1 Answer

You are almost there; you just need to use the 'paste' function:

string <- paste("month", "time", sep = " + ")

I am not familiar with the 'penalized' function. However, if you are having further issues you might need to paste the entire formula. For example, for a linear regression, you would need to use something like:

string_variables <- paste("month", "time", sep = " + ")
string_formula <- paste("y ~ ", string_variables ,sep = " ")

# simple linear regression usage
ens <- lm(formula = string_formula, data=training)
summary(ens)
stewart
  • Your suggestion works in your setting. However, how do I change it so that I can just specify the vector s=c("month","time") instead of the two strings (in fact I've got a much longer vector and don't want to use a for loop)? string_variables<-paste(string,sep="+") doesn't work, maybe lapply? Moreover, I tried to use string_formula <- paste("y~ ,", string_variables ,sep = " ") for penalized but this does not work although I don't see the difference to lm except of course for the slightly different syntax. Any ideas? – Mac Oct 03 '19 at 10:18
  • @Mac: The `lm` function apparently silently coerces strings to R formulas, which are language elements. I'm guessing that `penalized` would have worked if you had used `as.formula` to do the coercion. – IRTFM Jul 09 '22 at 23:39
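
Following up on the two comments above, here is a minimal sketch of how a character vector of predictor names could be turned into a formula object. `paste()` needs `collapse`, not `sep`, to join the elements of a single vector, and base R's `reformulate()` does the same job in one step. The `penalized()` call is an assumption based on that package's documented arguments (`response`, `penalized`, `unpenalized`): they are passed as separate arguments, so a single string containing a comma cannot stand in for them. That call is guarded so the rest runs without the package installed. The `training` data frame here is toy data made up for illustration.

```r
# Toy data standing in for the asker's `training` data frame (hypothetical values).
set.seed(1)
training <- data.frame(date = seq(as.Date("2018-01-01"), as.Date("2018-10-01"), by = "days"))
training$month <- as.factor(format(training$date, "%m"))
training$y <- runif(nrow(training), 1, 100)
training$time <- (-floor(nrow(training) / 2):(ceiling(nrow(training) / 2) - 1)) / 1000

yname <- "y"
# setdiff() drops names safely; x[-which(...)] returns an empty vector when nothing matches.
xnames <- setdiff(colnames(training), c(yname, "date"))

# collapse= joins the elements of ONE vector; sep= only separates different arguments.
rhs_string <- paste(xnames, collapse = " + ")

# One-sided formula for the predictors: ~month + time
rhs <- as.formula(paste("~", rhs_string))

# Two-sided formula built directly from the name vector: y ~ month + time
full <- reformulate(xnames, response = yname)

# lm() accepts the formula object (it would also coerce a plain string).
fit_lm <- lm(full, data = training)

# Assumed usage of penalized(): response and predictors are separate arguments,
# so the predictors go in as a one-sided formula object, not as pasted text.
if (requireNamespace("penalized", quietly = TRUE)) {
  fit_pen <- penalized::penalized(
    response = training[[yname]], penalized = rhs, unpenalized = ~0,
    lambda1 = 1, lambda2 = 1, positive = TRUE, data = training
  )
}
```

This works with a name vector of any length, so no loop or `lapply` is needed. Whether `penalized` also accepts the two-sided `full` formula as its first argument depends on the package version, so the explicit split into `response` and `penalized` is the more conservative choice.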