Here is an approach with lapply(), using the mtcars data set. We will select mpg as the dependent variable, extract the remaining columns from the data set, and then use lapply() to run a regression model on each element in the indepVars vector. The output from each model is saved to a list that includes the name of the independent variable as well as the resulting model object.
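# every column except the dependent variable
indepVars <- names(mtcars)[!(names(mtcars) %in% "mpg")]
# fit one simple regression per independent variable
modelList <- lapply(indepVars, function(x) {
  result <- lm(mpg ~ mtcars[[x]], data = mtcars)
  list(variable = x, model = result)
})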
# print the first model
modelList[[1]]$variable
summary(modelList[[1]]$model)
The extract operator [[ can then be used to print the content of any of the models.
...and the output:
> # print the first model
> modelList[[1]]$variable
[1] "cyl"
> summary(modelList[[1]]$model)

Call:
lm(formula = mpg ~ mtcars[[x]], data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-4.9814 -2.1185  0.2217  1.0717  7.5186

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.8846     2.0738   18.27  < 2e-16 ***
mtcars[[x]]  -2.8758     0.3224   -8.92 6.11e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.206 on 30 degrees of freedom
Multiple R-squared:  0.7262, Adjusted R-squared:  0.7171
F-statistic: 79.56 on 1 and 30 DF,  p-value: 6.113e-10

>
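In the summary above, the slope is labelled mtcars[[x]] rather than cyl, because the formula refers to the column through the extract operator. One possible variation, offered as a sketch rather than as part of the original answer, is to build each formula from the variable name with reformulate() so that the coefficient carries the column name. The names depVar and namedModels are introduced here only for illustration.

```r
# Sketch: build "mpg ~ <variable>" from strings so coefficients are labelled by name
depVar <- "mpg"                                # illustrative name, not from the answer
indepVars <- setdiff(names(mtcars), depVar)

namedModels <- lapply(indepVars, function(x) {
  # reformulate("cyl", response = "mpg") yields the formula mpg ~ cyl
  lm(reformulate(x, response = depVar), data = mtcars)
})
names(namedModels) <- indepVars                # allows lookup by variable name

# the slope is now labelled with the column name instead of mtcars[[x]]
coef(namedModels[["cyl"]])
```

Naming the list elements also makes it possible to retrieve a model by variable name instead of by position.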
Responding to the comment from the original poster, here is the code necessary to encapsulate the above process within an R function. The function regList() takes a data frame and the name of the dependent variable as a string, and then runs a regression of the dependent variable on each of the remaining variables in the data frame passed to the function.
regList <- function(dataframe, depVar) {
  # every column except the dependent variable
  indepVars <- names(dataframe)[!(names(dataframe) %in% depVar)]
  modelList <- lapply(indepVars, function(x) {
    # report which independent variable is being processed
    message("x is: ", x)
    result <- lm(dataframe[[depVar]] ~ dataframe[[x]], data = dataframe)
    list(variable = x, model = result)
  })
  modelList
}
modelList <- regList(mtcars,"mpg")
# print the first model
modelList[[1]]$variable
summary(modelList[[1]]$model)
One can extract a variety of content from the individual model objects. The output is as follows:
> modelList <- regList(mtcars,"mpg")
> # print the first model
> modelList[[1]]$variable
[1] "cyl"
> summary(modelList[[1]]$model)

Call:
lm(formula = dataframe[[depVar]] ~ dataframe[[x]], data = dataframe)

Residuals:
    Min      1Q  Median      3Q     Max
-4.9814 -2.1185  0.2217  1.0717  7.5186

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)     37.8846     2.0738   18.27  < 2e-16 ***
dataframe[[x]]  -2.8758     0.3224   -8.92 6.11e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.206 on 30 degrees of freedom
Multiple R-squared:  0.7262, Adjusted R-squared:  0.7171
F-statistic: 79.56 on 1 and 30 DF,  p-value: 6.113e-10

>
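As an illustration of extracting content from the individual model objects, the following sketch collects the slope estimate, its p-value, and the R-squared from every model into a single data frame. It rebuilds the list so that it runs on its own. The name modelSummary and its column names are assumptions made for this example, not part of the original answer.

```r
# Rebuild the list of models so this example is self-contained
modelList <- lapply(setdiff(names(mtcars), "mpg"), function(x) {
  list(variable = x, model = lm(mpg ~ mtcars[[x]], data = mtcars))
})

# One row per model: slope, its p-value, and R-squared
modelSummary <- do.call(rbind, lapply(modelList, function(m) {
  s <- summary(m$model)
  data.frame(
    variable = m$variable,
    slope    = s$coefficients[2, "Estimate"],   # row 2 is the independent variable
    pValue   = s$coefficients[2, "Pr(>|t|)"],
    rSquared = s$r.squared
  )
}))

# sort so the best-fitting single predictor comes first
modelSummary[order(-modelSummary$rSquared), ]
```

Indexing the coefficient matrix by row position works here because each model has exactly one numeric predictor. A factor predictor would produce more than one slope row and would need different handling.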