We have modified the input slightly to make the grouping columns factors and to provide more data per group in the Note at the end.
1) lmList Then we use lmList
from nlme which can do regression by groups. We have used pool=FALSE
but you can omit it and use the default of pool=TRUE
depending on what you want. See ?lmList
for details.
library(nlme)
fm <- lmList(y ~ x | g, transform(df, g = strata:category), pool = FALSE)
summary(fm)
giving:
Call:
Model: y ~ x | g
Data: print(transform(df, g = strata:category))
Coefficients:
(Intercept)
Estimate Std. Error t value Pr(>|t|)
1:A -0.6508622 0.7185126 -0.9058467 0.53142497
1:B -1.8043171 1.5367930 -1.1740794 0.44913412
2:A 4.0963651 1.4952687 2.7395512 0.22281400
2:B 3.5787230 0.1888514 18.9499422 0.03356368
x
Estimate Std. Error t value Pr(>|t|)
1:A -0.03320257 0.21035895 -0.1578377 0.9003396
1:B 0.33526295 0.35569847 0.9425482 0.5188229
2:A -0.45361115 0.16347186 -2.7748577 0.2202010
2:B -0.35620163 0.01863826 -19.1113091 0.0332808
2) lm We can alternately formulate a single lm
model with separate intercepts and slopes. This creates separate intercept and slope model matrix columns within each group. Look at model.matrix(y ~ (strata:category)/(x + 1) - 1, df)
to get insight.
fm2 <- lm(y ~ (strata:category)/(x + 1) - 1, df)
summary(fm2)
giving:
Call:
lm(formula = y ~ (strata:category)/(x + 1) - 1, data = df)
Residuals:
1 2 3 4 5 6 7 8
0.24290 0.41073 -0.48580 -0.82145 0.24290 0.41073 0.18876 -0.02152
9 10 11 12
-0.37752 0.04304 0.18876 -0.02152
Coefficients:
Estimate Std. Error t value Pr(>|t|)
strata1:categoryA -0.6509 0.7596 -0.857 0.440
strata2:categoryA 4.0964 2.0343 2.014 0.114
strata1:categoryB -1.8043 0.9609 -1.878 0.134
strata2:categoryB 3.5787 2.2534 1.588 0.187
strata1:categoryA:x -0.0332 0.2224 -0.149 0.889
strata2:categoryA:x -0.4536 0.2224 -2.040 0.111
strata1:categoryB:x 0.3353 0.2224 1.507 0.206
strata2:categoryB:x -0.3562 0.2224 -1.602 0.184
Residual standard error: 0.629 on 4 degrees of freedom
Multiple R-squared: 0.7886, Adjusted R-squared: 0.3658
F-statistic: 1.865 on 8 and 4 DF, p-value: 0.2862
3) dplyr/broom
We can produce a list or tibble like this.
3a) group_map
library(broom)
library(dplyr)
df %>%
group_by(strata, category) %>%
group_map(~ tidy(lm(y ~ x, .)))
giving:
[[1]]
# A tibble: 2 x 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) -0.651 0.719 -0.906 0.531
2 x -0.0332 0.210 -0.158 0.900
[[2]]
# A tibble: 2 x 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) -1.80 1.54 -1.17 0.449
2 x 0.335 0.356 0.943 0.519
[[3]]
# A tibble: 2 x 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 4.10 1.50 2.74 0.223
2 x -0.454 0.163 -2.77 0.220
[[4]]
# A tibble: 2 x 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 3.58 0.189 18.9 0.0336
2 x -0.356 0.0186 -19.1 0.0333
3b) group_modify
library(broom)
library(dplyr)
df %>%
group_by(strata, category) %>%
group_modify(~ tidy(lm(y ~ x, .))) %>%
ungroup
giving:
# A tibble: 8 x 7
strata category term estimate std.error statistic p.value
<fct> <fct> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 A (Intercept) -0.651 0.719 -0.906 0.531
2 1 A x -0.0332 0.210 -0.158 0.900
3 1 B (Intercept) -1.80 1.54 -1.17 0.449
4 1 B x 0.335 0.356 0.943 0.519
5 2 A (Intercept) 4.10 1.50 2.74 0.223
6 2 A x -0.454 0.163 -2.77 0.220
7 2 B (Intercept) 3.58 0.189 18.9 0.0336
8 2 B x -0.356 0.0186 -19.1 0.0333
4) listcompr Another approach is to use list comprehensions via the listcompr package. The first argument is a template for the component names, the second argument is the expression -- in this case the lm
call, and the remaining arguments define the indexes s
and c
.
library(listcompr)
with(df, gen.named.list(str = "{s}.{c}",
expr = lm(y ~ x, subset = strata == s & category == c),
s = levels(strata), c = levels(category)))
giving this named list of lm
objects:
$`1.A`
Call:
lm(formula = y ~ x, subset = strata == s & category == c)
Coefficients:
(Intercept) x
-0.6509 -0.0332
$`2.A`
Call:
lm(formula = y ~ x, subset = strata == s & category == c)
Coefficients:
(Intercept) x
4.0964 -0.4536
$`1.B`
Call:
lm(formula = y ~ x, subset = strata == s & category == c)
Coefficients:
(Intercept) x
-1.8043 0.3353
$`2.B`
Call:
lm(formula = y ~ x, subset = strata == s & category == c)
Coefficients:
(Intercept) x
3.5787 -0.3562