Problem
I would like to plot estimated marginal means from a three-way factorial experiment with letters indicating significantly different means, adjusted for multiple comparisons. My current workflow is to fit the model with lmer()
, calculate estimated marginal means with emmeans()
, then implement the compact letter display algorithm with cld()
.
My problem is that the graph is too busy when you plot all three-way interactions on the same plot. So I would like to split up the plot and generate different sets of letters for each subplot, starting with "a". The problem is that when I use the by
argument in cld
to split it up, it does a separate correction for multiple comparisons within each by
group. Because there are now fewer tests within each group, this results in a less conservative correction. But if I try to manually split up the output of cld()
without a by
group, I would have to manually re-implement the letter algorithm for each subplot. I guess I could do that but it seems cumbersome. I am trying to share this code with a client for him to modify later, so that solution would probably be too complex. Does anyone have an easy way to either:
- Get the output of
cld()
to use one combined correction for allby
groups. - Using a relatively simple method, reduce the compact letter display for each subgroup to the minimal necessary number of letters.
Reproducible example
Load packages and data.
library(lme4)
library(emmeans)
library(multcomp)
dat <- structure(list(y = c(2933.928571, 930.3571429, 210.7142857, 255.3571429,
2112.5, 1835.714286, 1358.928571, 1560.714286, 9192.857143, 3519.642857,
2771.428571, 7433.928571, 4444.642857, 3025, 3225, 2103.571429,
3876.785714, 925, 1714.285714, 3225, 1783.928571, 2223.214286,
2537.5, 2251.785714, 7326.785714, 5130.357143, 2539.285714, 6116.071429,
5808.928571, 3341.071429, 2212.5, 7562.5, 3907.142857, 3241.071429,
1294.642857, 4325, 4487.5, 2551.785714, 5648.214286, 3198.214286,
1075, 335.7142857, 394.6428571, 1605.357143, 658.9285714, 805.3571429,
1580.357143, 1575, 2037.5, 1721.428571, 1014.285714, 2994.642857,
2116.071429, 800, 2925, 3955.357143, 9075, 3917.857143, 2666.071429,
6141.071429, 3925, 1626.785714, 2864.285714, 7271.428571, 3432.142857,
1826.785714, 514.2857143, 1319.642857, 1782.142857, 2637.5, 1355.357143,
3328.571429, 1914.285714, 817.8571429, 1896.428571, 2121.428571,
521.4285714, 360.7142857, 1114.285714, 1139.285714, 7042.857143,
2371.428571, 2287.5, 4967.857143, 2180.357143, 1944.642857, 2408.928571,
5289.285714, 7028.571429, 3080.357143, 5394.642857, 5973.214286,
7323.214286, 1419.642857, 1455.357143, 4657.142857, 7069.642857,
2451.785714, 4319.642857, 5562.5, 3953.571429, 1182.142857, 1957.142857,
3796.428571, 1773.214286, 400, 871.4285714, 842.8571429, 657.1428571,
1360.714286, 1853.571429, 1826.785714, 3405.357143, 2605.357143,
5983.928571, 4935.714286, 4105.357143, 7666.071429, 3619.642857,
5085.714286, 1592.857143, 1751.785714, 5992.857143, 2987.5, 794.6428571,
3187.5, 825, 3244.642857), f1 = structure(c(4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A",
"B", "C", "D"), class = "factor"), f2 = structure(c(2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("foo",
"bar"), class = "factor"), f3 = structure(c(4L, 3L, 2L, 1L, 3L,
4L, 1L, 2L, 4L, 2L, 1L, 3L, 3L, 2L, 4L, 1L, 3L, 1L, 4L, 2L, 2L,
4L, 3L, 1L, 2L, 4L, 1L, 3L, 2L, 3L, 1L, 4L, 3L, 4L, 1L, 2L, 3L,
2L, 4L, 1L, 2L, 1L, 3L, 4L, 1L, 2L, 4L, 3L, 2L, 1L, 3L, 4L, 3L,
1L, 4L, 2L, 4L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 3L, 4L, 1L, 2L, 1L,
4L, 3L, 2L, 3L, 1L, 4L, 2L, 1L, 3L, 4L, 2L, 4L, 3L, 1L, 2L, 1L,
3L, 4L, 2L, 3L, 1L, 4L, 2L, 4L, 1L, 3L, 2L, 2L, 3L, 4L, 1L, 4L,
1L, 2L, 3L, 4L, 1L, 3L, 2L, 1L, 2L, 4L, 3L, 1L, 2L, 4L, 3L, 1L,
4L, 2L, 3L, 1L, 3L, 4L, 2L, 1L, 3L, 2L, 4L), .Label = c("L1",
"L2", "L3", "L4"), class = "factor"), block = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("1",
"2", "3", "4"), class = "factor")), row.names = c(NA, -128L), class = "data.frame")
Fit model and get estimated marginal means.
fit <- lmer(log10(y) ~ f1 * f2 * f3 + (1 | block), data = dat)
emm <- emmeans(fit, ~ f1 + f2 + f3, mode = 'Kenward-Roger', type = 'response')
Version 1
In this version, I take the CLD as a whole which correctly uses the Sidak adjustment for 496 tests. However let's say I wanted to plot only those rows where f2 == 'bar'
. The letters are no longer correct because some are redundant (less than 8 are needed). Is there any function that can reduce the letters down?
cldisplay1 <- cld(emm, adjust = 'sidak', Letters = letters)
subset(as.data.frame(cldisplay1), f2 == 'bar') # correct comparisons but contains redundant letters
output
f1 f2 f3 response SE df lower.CL upper.CL .group
8 D bar L1 365.6732 76.1231 96 185.9699 719.0244 a
24 D bar L3 582.8573 121.3349 96 296.4229 1146.0742 ab
16 D bar L2 682.9238 142.1659 96 347.3136 1342.8353 ab
7 C bar L1 898.1560 186.9714 96 456.7740 1766.0470 abcd
6 B bar L1 1627.7069 338.8438 96 827.8006 3200.5652 bcdefg
15 C bar L2 1635.4393 340.4534 96 831.7330 3215.7694 bcdefg
32 D bar L4 1746.6052 363.5951 96 888.2685 3434.3552 bcdefg
31 C bar L4 2348.6629 488.9270 96 1194.4562 4618.1832 cdefgh
21 A bar L3 2499.6772 520.3640 96 1271.2573 4915.1230 cdefgh
5 A bar L1 2545.4594 529.8946 96 1294.5407 5005.1448 cdefgh
23 C bar L3 2561.0138 533.1326 96 1302.4512 5035.7294 cdefgh
30 B bar L4 3158.6969 657.5538 96 1606.4140 6210.9556 efgh
22 B bar L3 3364.9438 700.4887 96 1711.3047 6616.4994 efgh
14 B bar L2 3411.4009 710.1598 96 1734.9313 6707.8482 efgh
13 A bar L2 3769.4223 784.6900 96 1917.0098 7411.8269 efgh
29 A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551 h
Version 2
In this version, I use the by
argument to cld()
to split by f2
. This reduces the letters within each group, but the Sidak adjustment is now less conservative. For example, row 8 and row 16 are not significantly different at the adjusted alpha-level from the comparison above, but now they are different. But I do not want to change the tests used, just to plot only a subset of the data. Is there a way to specify the number of tests I'm adjusting for as a whole, even though cld
is split up with by
groups?
cldisplay2 <- cld(emm, adjust = 'sidak', by = 'f2', Letters = letters)
subset(as.data.frame(cldisplay2), f2 == 'bar')
output
f1 f2 f3 response SE df lower.CL upper.CL .group
8 D bar L1 365.6732 76.1231 96 185.9699 719.0244 a
24 D bar L3 582.8573 121.3349 96 296.4229 1146.0742 ab
16 D bar L2 682.9238 142.1659 96 347.3136 1342.8353 abc
7 C bar L1 898.1560 186.9714 96 456.7740 1766.0470 abcd
6 B bar L1 1627.7069 338.8438 96 827.8006 3200.5652 bcde
15 C bar L2 1635.4393 340.4534 96 831.7330 3215.7694 bcde
32 D bar L4 1746.6052 363.5951 96 888.2685 3434.3552 cde
31 C bar L4 2348.6629 488.9270 96 1194.4562 4618.1832 de
21 A bar L3 2499.6772 520.3640 96 1271.2573 4915.1230 def
5 A bar L1 2545.4594 529.8946 96 1294.5407 5005.1448 def
23 C bar L3 2561.0138 533.1326 96 1302.4512 5035.7294 def
30 B bar L4 3158.6969 657.5538 96 1606.4140 6210.9556 ef
22 B bar L3 3364.9438 700.4887 96 1711.3047 6616.4994 ef
14 B bar L2 3411.4009 710.1598 96 1734.9313 6707.8482 ef
13 A bar L2 3769.4223 784.6900 96 1917.0098 7411.8269 ef
29 A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551 f