Problem: I would like to learn how can I change an order of samples for which Tukey's test in R calculates means and assign corresponding letters. Very simple example is below.
I played with iris data and have found that there are differences in Sepal.Length among different Species. Here is the boxplot:
I conducted an ANOVA test and have found that differences are statistically significant.
> fit <- lm(Sepal.Length ~ Species, data = iris)
> summary(aov(fit))
Df Sum Sq Mean Sq F value Pr(>F)
Species 2 63.21 31.606 119.3 <2e-16 ***
Residuals 147 38.96 0.265
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Then I conducted Tukey's test and got following:
> library(agricolae)
> HSD.test(fit, "Species", group=T, console=T)
Study: fit ~ "Species"
HSD Test for Sepal.Length
Mean Square Error: 0.2650082
Species, means
Sepal.Length std r Min Max
setosa 5.006 0.3524897 50 4.3 5.8
versicolor 5.936 0.5161711 50 4.9 7.0
virginica 6.588 0.6358796 50 4.9 7.9
alpha: 0.05 ; Df Error: 147
Critical Value of Studentized Range: 3.348424
Honestly Significant Difference: 0.2437727
Means with the same letter are not significantly different.
Groups, Treatments and means
a virginica 6.588
b versicolor 5.936
c setosa 5.006
According the table of groups, HSD.test function sorts means into descending order and then assign letters. Thus, "virginica" have the largest mean, therefore it is the first in the table.
Questions: Is there any way to change the default sorting and assigning of letters? Can I sort samples in ascending order of means and then assign letters. An expected output is following:
a setosa 5.006
b versicolor 5.936
c virginica 6.588
Possible solution: In package multcomp there are two functions which can do it working together:
1 - glht
do Tukey's test
> an <- aov(fit)
> library(multcomp)
> glht(an, linfct = mcp(Species = "Tukey"))
General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Linear Hypotheses:
Estimate
versicolor - setosa == 0 0.930
virginica - setosa == 0 1.582
virginica - versicolor == 0 0.652
2 - cld
can provide me with letters assigned to Species
according levels of the factor iris$Species
> cld(glht(an, linfct = mcp(Species = "Tukey")))
setosa versicolor virginica
"a" "b" "c"
Unfortunately, glht
function do not show another data which can be useful and needed for creating of barplot (means, std, p-values). Of course, I can do it separately with another special functions, or just use both HSD.test
and cld
. But I would prefer to solve a problem with sorting of means in HSD.test
function and use only this one.