One can use the formula() function to generate formulas from strings in R. Since the OP isn't reproducible, we'll demonstrate formula() by using the mtcars data set:
data(mtcars) # Use motor trend cars data set
dvs <- c("mpg","qsec")
ivs <- c("am","wt","disp")
for(d in dvs){
  for(i in ivs){
    message(paste("d is: ", d, "i is: ",i))
    print(summary(lm(formula(paste(d,"~",i)),mtcars)))
  }
}
...and the first part of the output:
> for(d in dvs){
+ for(i in ivs){
+ message(paste("d is: ", d, "i is: ",i))
+ print(summary(lm(formula(paste(d,"~",i)),mtcars)))
+ }
+ }
d is: mpg i is: am
Call:
lm(formula = formula(paste(d, "~", i)), data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-9.3923 -3.0923 -0.2974 3.2439 9.5077
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.147 1.125 15.247 1.13e-15 ***
am 7.245 1.764 4.106 0.000285 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.902 on 30 degrees of freedom
Multiple R-squared: 0.3598, Adjusted R-squared: 0.3385
F-statistic: 16.86 on 1 and 30 DF, p-value: 0.000285
Since the output from lm() can be saved in an object, one can also generate a list() of model objects and manipulate them further in R.
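As a minimal sketch of that idea, the loop below stores each fitted model in a named list instead of printing it, then pulls a couple of statistics out of every model. The names `models`, `key`, `slopes`, and `adj_r2` are illustrative and do not come from the OP.

```r
data(mtcars)
dvs <- c("mpg", "qsec")
ivs <- c("am", "wt", "disp")

models <- list()  # illustrative container for the fitted models
for (d in dvs) {
  for (i in ivs) {
    key <- paste(d, i, sep = "_")  # e.g. "mpg_am"
    models[[key]] <- lm(formula(paste(d, "~", i)), data = mtcars)
  }
}

# Work with the stored models after the loop has finished
slopes <- sapply(models, function(m) coef(m)[[2]])             # slope of each model
adj_r2 <- sapply(models, function(m) summary(m)$adj.r.squared) # adjusted R-squared
print(round(cbind(slopes, adj_r2), 4))
```

Individual models remain available by name, so `summary(models[["mpg_am"]])` reproduces the summary shown above.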
To build the variable names for formula() from vectors that hold the pieces of those names, one can use the paste() or paste0() functions, much as in the mtcars example above. paste0() inserts no separator between its arguments, whereas paste() separates them with a single space by default.
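A short illustration of the difference, using made-up name fragments (`pct` and `typ` are placeholder variables, not names from the OP):

```r
pct <- "p50"
typ <- "high4"

with_spaces <- paste("x1", pct, typ)             # default separator is a space
no_spaces   <- paste0("x1_", pct, "_", typ)      # no separator is added
custom_sep  <- paste("x1", pct, typ, sep = "_")  # paste() with an explicit separator

print(c(with_spaces, no_spaces, custom_sep))
```

Either `paste0()` with the underscores written out or `paste()` with `sep = "_"` yields a string that is a valid R variable name.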
Again, making some guesses as to the actual intended formulae, we'll use the OP's nested for() loops to generate strings that can be passed to formula() inside a call to lm().
#
# generate formulas using content from OP
#
credit <- c("short_term","medium_term","long_term")
percentile <- c("p50","p75","p90")
type <- c("high4","high5","high6")
for (c in credit) {
  for (p in percentile) {
    for (t in type) {
      aFormula <- paste0("y_",c," ~ x1_",p,"_",t," + x2 + x3")
      print(aFormula)
    }
  }
}
...and the start of the output:
> credit <- c("short_term","medium_term","long_term")
> percentile <- c("p50","p75","p90")
> type <- c("high4","high5","high6")
>
> for (c in credit) {
+ for (p in percentile) {
+ for (t in type) {
+ aFormula <- paste0("y_",c," ~ x1_",p,"_",t," + x2 + x3")
+ print(aFormula)
+ }
+ }
+ }
[1] "y_short_term ~ x1_p50_high4 + x2 + x3"
[1] "y_short_term ~ x1_p50_high5 + x2 + x3"
[1] "y_short_term ~ x1_p50_high6 + x2 + x3"
[1] "y_short_term ~ x1_p75_high4 + x2 + x3"
[1] "y_short_term ~ x1_p75_high5 + x2 + x3"
[1] "y_short_term ~ x1_p75_high6 + x2 + x3"
[1] "y_short_term ~ x1_p90_high4 + x2 + x3"
[1] "y_short_term ~ x1_p90_high5 + x2 + x3"
[1] "y_short_term ~ x1_p90_high6 + x2 + x3"
[1] "y_medium_term ~ x1_p50_high4 + x2 + x3"
[1] "y_medium_term ~ x1_p50_high5 + x2 + x3"
[1] "y_medium_term ~ x1_p50_high6 + x2 + x3"
.
.
.
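Note that the content in the OP inconsistently uses - vs. _ in the variable names, so I used _ at all relevant spots in the formulae. Inside a formula, - is the operator that removes a term, so it cannot be part of an unquoted variable name.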
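To close the loop, here is a hedged sketch of how the generated strings could be fed to lm(). The OP's data frame is not available, so `fake_data` below is simulated random noise with one column per variable named in the formulas. It exists only to show that the formulas parse and the models fit. `combos`, `formula_strings`, `fake_data`, and `fits` are illustrative names.

```r
set.seed(1)  # simulated data only; the OP's data are not available

credit     <- c("short_term", "medium_term", "long_term")
percentile <- c("p50", "p75", "p90")
type       <- c("high4", "high5", "high6")

# All 27 combinations; the first argument varies fastest, which matches
# the order produced by the nested for() loops above
combos <- expand.grid(type = type, percentile = percentile, credit = credit,
                      stringsAsFactors = FALSE)
formula_strings <- with(combos,
  paste0("y_", credit, " ~ x1_", percentile, "_", type, " + x2 + x3"))

# Collect every variable name the formulas refer to and simulate a column for each
vars <- unique(unlist(lapply(formula_strings, function(s) all.vars(formula(s)))))
fake_data <- as.data.frame(
  matrix(rnorm(50 * length(vars)), nrow = 50, dimnames = list(NULL, vars)))

# Fit one model per formula string and keep the results in a named list
fits <- lapply(formula_strings, function(s) lm(formula(s), data = fake_data))
names(fits) <- formula_strings

print(coef(fits[[1]]))
```

With real data, replacing `fake_data` with the actual data frame should be the only change needed, provided its column names follow the same pattern.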