Does anyone know if it is possible to use lmFit or lm in R to calculate a linear model with categorical variables while including all possible comparisons between the categories? For example, with the test data created here:
set.seed(25)
f <- gl(n = 3, k = 20, labels = c("control", "low", "high"))
mat <- model.matrix(~f, data = data.frame(f = f))
beta <- c(12, 3, 6) # these are the simulated regression coefficients
y <- rnorm(n = 60, mean = mat %*% beta, sd = 2)
m <- lm(y ~ f)
I get the summary:
summary(m)
Call:
lm(formula = y ~ f)

Residuals:
    Min      1Q  Median      3Q     Max
-4.3505 -1.6114  0.1608  1.1615  5.2010

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  11.4976     0.4629  24.840  < 2e-16 ***
flow          3.0370     0.6546   4.639 2.09e-05 ***
fhigh         6.1630     0.6546   9.415 3.27e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.07 on 57 degrees of freedom
Multiple R-squared: 0.6086, Adjusted R-squared: 0.5949
F-statistic: 44.32 on 2 and 57 DF, p-value: 2.446e-12
The summary only shows two comparisons because the default treatment contrasts ("contr.treatment") compare each level to the reference level, here "low" to "control" and "high" to "control".
Is it also possible to get the comparison between "high" and "low"?
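To make the question concrete, here is a minimal sketch of two things that might give the missing comparison, assuming the same simulated data as above. The names f_low, m_low and cvec are introduced here for illustration only. The first option refits the model with "low" as the reference level; the second keeps the original fit and tests the difference fhigh - flow as a linear combination of the coefficients.

```r
# Rebuild the test data from the question
set.seed(25)
f <- gl(n = 3, k = 20, labels = c("control", "low", "high"))
mat <- model.matrix(~f, data = data.frame(f = f))
beta <- c(12, 3, 6)
y <- rnorm(n = 60, mean = mat %*% beta, sd = 2)
m <- lm(y ~ f)

# Option 1 (assumption: refitting is acceptable):
# make "low" the reference level, so one coefficient is high vs. low
f_low <- relevel(f, ref = "low")
m_low <- lm(y ~ f_low)
coef(summary(m_low))["f_lowhigh", ]

# Option 2: keep the original fit and test high - low
# as a linear combination of its coefficients
cvec <- c("(Intercept)" = 0, flow = -1, fhigh = 1)
est <- sum(cvec * coef(m))                            # estimated difference
se <- sqrt(drop(t(cvec) %*% vcov(m) %*% cvec))        # its standard error
t_val <- est / se
p_val <- 2 * pt(abs(t_val), df = df.residual(m), lower.tail = FALSE)
c(estimate = est, se = se, t = t_val, p = p_val)
```

Both options should agree on the estimate and standard error, since they describe the same fitted model. Neither adjusts for multiple comparisons. For lmFit, the limma functions makeContrasts and contrasts.fit look like the analogous route, though that is not tried here.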