Say I want to use lm() to estimate the means of y over k groups, where the groups are defined by a factor. If I just run lm(y ~ factor), I get an intercept and k-1 coefficients for the remaining factor levels, each expressed as a difference from the intercept (the mean of the reference level). I would like to get the group means directly instead.

Is there a clean way to do this with contrasts in lm()? I am not sure what such a contrast would be called... orthogonal? I can obviously remove the intercept with lm(y ~ -1 + factor), but this gives me wrong R2 values:

reg1 <- lm(Sepal.Length ~ Species, data = iris)
reg2 <- lm(Sepal.Length ~ -1 + Species, data = iris)
## get coefs
coef(reg1) # not what I want
#>       (Intercept) Speciesversicolor  Speciesvirginica
#>             5.006             0.930             1.582
coef(reg2) # what I want
#>     Speciessetosa Speciesversicolor  Speciesvirginica
#>             5.006             5.936             6.588
## The models are equivalent:
all.equal(fitted(reg1), fitted(reg2))
#> [1] TRUE
# but the -1 trick will create problems for some stats, such as R2
summary(reg1)$r.squared
#> [1] 0.6187057
summary(reg2)$r.squared
#> [1] 0.9925426
Created on 2019-05-01 by the reprex package (v0.2.1)
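
To make the R2 discrepancy above concrete, here is a minimal sketch that reproduces both values by hand. It relies on my understanding that summary.lm() measures the total sum of squares around mean(y) when the model has an intercept, and around zero when it does not; the names r2_centered and r2_uncentered are made up for this illustration.

```r
# Same two models as in the reprex above (iris ships with base R)
reg1 <- lm(Sepal.Length ~ Species, data = iris)
reg2 <- lm(Sepal.Length ~ -1 + Species, data = iris)

y   <- iris$Sepal.Length
rss <- sum(residuals(reg1)^2)  # identical for reg2, since the fitted values coincide

# With an intercept: total sum of squares centered on mean(y)
r2_centered <- 1 - rss / sum((y - mean(y))^2)

# Without an intercept: total sum of squares taken around zero
r2_uncentered <- 1 - rss / sum(y^2)

# Compare the hand-computed values with what summary() reports
all.equal(r2_centered,   summary(reg1)$r.squared)
all.equal(r2_uncentered, summary(reg2)$r.squared)
```

So the two fits share the same residuals, and only the baseline that R2 is measured against changes once the intercept is dropped.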