There are many ways to do this in the larger tidyverse. I am a fan of dplyr::rowwise
for this kind of calculations. We can use the colnames
instead of the actual data and then create a matrix like tibble
with tidyr::expand_grid
which contains all combinations of outcomes and covars. Then we can use dplyr::rowwise
and use lm
inside list()
together with reformulate
which takes strings as inputs. To get the result we can use broom::tidy
.
library(dplyr)
library(purrr)
library(tidyr)
dataset <- tibble(
y1=rnorm(n=100),
y2=rnorm(n=100),
x1=rnorm(n=100),
x2=rnorm(n=100))
outcomes <- dataset %>%
select(y1,y2) %>% colnames
covars <- dataset %>%
select(x1,x2) %>% colnames
paramlist <- expand_grid(outcomes, covars)
paramlist %>%
rowwise %>%
mutate(mod = list(lm(reformulate(outcomes, covars), data = dataset)),
res = list(broom::tidy(mod)))
#> # A tibble: 4 x 4
#> # Rowwise:
#> outcomes covars mod res
#> <chr> <chr> <list> <list>
#> 1 y1 x1 <lm> <tibble [2 x 5]>
#> 2 y1 x2 <lm> <tibble [2 x 5]>
#> 3 y2 x1 <lm> <tibble [2 x 5]>
#> 4 y2 x2 <lm> <tibble [2 x 5]>
Created on 2021-09-06 by the reprex package (v2.0.1)
We can do the same thing with {purrr} instead of dplyr::rowwise
:
paramlist %>%
mutate(mod = map2(outcomes, covars, ~ lm(reformulate(.y, .x), data = dataset)),
res = map(mod, broom::tidy))
#> # A tibble: 4 x 4
#> outcomes covars mod res
#> <chr> <chr> <list> <list>
#> 1 y1 x1 <lm> <tibble [2 x 5]>
#> 2 y1 x2 <lm> <tibble [2 x 5]>
#> 3 y2 x1 <lm> <tibble [2 x 5]>
#> 4 y2 x2 <lm> <tibble [2 x 5]>
Created on 2021-09-06 by the reprex package (v2.0.1)
Another pure {purrr} solution is to use a nested map
call. Since it is nested we need to flatten
the results before we can use map(summary)
on them.
# outcomes and covars are the same strings as above
outcomes %>%
map(~ map(covars, function(.y) lm(reformulate(.y, .x), data = dataset))) %>%
flatten %>%
map(summary)
#> [[1]]
#>
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.20892 -0.56744 -0.08498 0.55445 2.10146
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.0009328 0.0923062 -0.010 0.992
#> x1 -0.0809739 0.0932059 -0.869 0.387
#>
#> Residual standard error: 0.9173 on 98 degrees of freedom
#> Multiple R-squared: 0.007643, Adjusted R-squared: -0.002483
#> F-statistic: 0.7548 on 1 and 98 DF, p-value: 0.3871
#>
#>
#> [[2]]
#>
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.11442 -0.59186 -0.08153 0.61642 2.10575
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.02048 0.09461 -0.216 0.829
#> x2 -0.05159 0.10805 -0.477 0.634
#>
#> Residual standard error: 0.9197 on 98 degrees of freedom
#> Multiple R-squared: 0.002321, Adjusted R-squared: -0.007859
#> F-statistic: 0.228 on 1 and 98 DF, p-value: 0.6341
#>
#>
#> [[3]]
#>
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.3535 -0.7389 -0.2023 0.6236 3.8627
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.08178 0.10659 -0.767 0.445
#> x1 -0.08476 0.10763 -0.788 0.433
#>
#> Residual standard error: 1.059 on 98 degrees of freedom
#> Multiple R-squared: 0.006289, Adjusted R-squared: -0.003851
#> F-statistic: 0.6202 on 1 and 98 DF, p-value: 0.4329
#>
#>
#> [[4]]
#>
#> Call:
#> lm(formula = reformulate(.y, .x), data = dataset)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -2.4867 -0.7020 -0.1935 0.5869 3.7574
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.06575 0.10875 -0.605 0.547
#> x2 0.12388 0.12420 0.997 0.321
#>
#> Residual standard error: 1.057 on 98 degrees of freedom
#> Multiple R-squared: 0.01005, Adjusted R-squared: -5.162e-05
#> F-statistic: 0.9949 on 1 and 98 DF, p-value: 0.321
Created on 2021-09-06 by the reprex package (v2.0.1)