How to separate compact letter display (CLD) in multcomp by group without changing the p-value adjustment method?

Question

Problem

I would like to plot estimated marginal means from a three-way factorial experiment with letters indicating significantly different means, adjusted for multiple comparisons. My current workflow is to fit the model with lmer(), calculate estimated marginal means with emmeans(), then implement the compact letter display algorithm with cld().

My problem is that the graph is too busy when you plot all three-way interactions on the same plot. So I would like to split up the plot and generate different sets of letters for each subplot, starting with "a". The problem is that when I use the by argument in cld to split it up, it does a separate correction for multiple comparisons within each by group. Because there are now fewer tests within each group, this results in a less conservative correction. But if I try to manually split up the output of cld() without a by group, I would have to manually re-implement the letter algorithm for each subplot. I guess I could do that but it seems cumbersome. I am trying to share this code with a client for him to modify later, so that solution would probably be too complex. Does anyone have an easy way to either:

Get the output of cld() to use one combined correction for all by groups.
Using a relatively simple method, reduce the compact letter display for each subgroup to the minimal necessary number of letters.

Reproducible example

Load packages and data.

library(lme4)
library(emmeans)
library(multcomp)

dat <- structure(list(y = c(2933.928571, 930.3571429, 210.7142857, 255.3571429, 
                            2112.5, 1835.714286, 1358.928571, 1560.714286, 9192.857143, 3519.642857, 
                            2771.428571, 7433.928571, 4444.642857, 3025, 3225, 2103.571429, 
                            3876.785714, 925, 1714.285714, 3225, 1783.928571, 2223.214286, 
                            2537.5, 2251.785714, 7326.785714, 5130.357143, 2539.285714, 6116.071429, 
                            5808.928571, 3341.071429, 2212.5, 7562.5, 3907.142857, 3241.071429, 
                            1294.642857, 4325, 4487.5, 2551.785714, 5648.214286, 3198.214286, 
                            1075, 335.7142857, 394.6428571, 1605.357143, 658.9285714, 805.3571429, 
                            1580.357143, 1575, 2037.5, 1721.428571, 1014.285714, 2994.642857, 
                            2116.071429, 800, 2925, 3955.357143, 9075, 3917.857143, 2666.071429, 
                            6141.071429, 3925, 1626.785714, 2864.285714, 7271.428571, 3432.142857, 
                            1826.785714, 514.2857143, 1319.642857, 1782.142857, 2637.5, 1355.357143, 
                            3328.571429, 1914.285714, 817.8571429, 1896.428571, 2121.428571, 
                            521.4285714, 360.7142857, 1114.285714, 1139.285714, 7042.857143, 
                            2371.428571, 2287.5, 4967.857143, 2180.357143, 1944.642857, 2408.928571, 
                            5289.285714, 7028.571429, 3080.357143, 5394.642857, 5973.214286, 
                            7323.214286, 1419.642857, 1455.357143, 4657.142857, 7069.642857, 
                            2451.785714, 4319.642857, 5562.5, 3953.571429, 1182.142857, 1957.142857, 
                            3796.428571, 1773.214286, 400, 871.4285714, 842.8571429, 657.1428571, 
                            1360.714286, 1853.571429, 1826.785714, 3405.357143, 2605.357143, 
                            5983.928571, 4935.714286, 4105.357143, 7666.071429, 3619.642857, 
                            5085.714286, 1592.857143, 1751.785714, 5992.857143, 2987.5, 794.6428571, 
                            3187.5, 825, 3244.642857), f1 = structure(c(4L, 4L, 4L, 4L, 4L, 
                                                                        4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
                                                                        3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
                                                                        2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 
                                                                        3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 
                                                                        3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 
                                                                        2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
                                                                        2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 
                                                                        1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", 
                                                                                                                                "B", "C", "D"), class = "factor"), f2 = structure(c(2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
                                                                                                                                                                                    1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
                                                                                                                                                                                    1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
                                                                                                                                                                                    2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("foo", 
                                                                                                                                                                                                                                                    "bar"), class = "factor"), f3 = structure(c(4L, 3L, 2L, 1L, 3L, 
                                                                                                                                                                                                                                                                                                4L, 1L, 2L, 4L, 2L, 1L, 3L, 3L, 2L, 4L, 1L, 3L, 1L, 4L, 2L, 2L, 
                                                                                                                                                                                                                                                                                                4L, 3L, 1L, 2L, 4L, 1L, 3L, 2L, 3L, 1L, 4L, 3L, 4L, 1L, 2L, 3L, 
                                                                                                                                                                                                                                                                                                2L, 4L, 1L, 2L, 1L, 3L, 4L, 1L, 2L, 4L, 3L, 2L, 1L, 3L, 4L, 3L, 
                                                                                                                                                                                                                                                                                                1L, 4L, 2L, 4L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 3L, 4L, 1L, 2L, 1L, 
                                                                                                                                                                                                                                                                                                4L, 3L, 2L, 3L, 1L, 4L, 2L, 1L, 3L, 4L, 2L, 4L, 3L, 1L, 2L, 1L, 
                                                                                                                                                                                                                                                                                                3L, 4L, 2L, 3L, 1L, 4L, 2L, 4L, 1L, 3L, 2L, 2L, 3L, 4L, 1L, 4L, 
                                                                                                                                                                                                                                                                                                1L, 2L, 3L, 4L, 1L, 3L, 2L, 1L, 2L, 4L, 3L, 1L, 2L, 4L, 3L, 1L, 
                                                                                                                                                                                                                                                                                                4L, 2L, 3L, 1L, 3L, 4L, 2L, 1L, 3L, 2L, 4L), .Label = c("L1", 
                                                                                                                                                                                                                                                                                                                                                        "L2", "L3", "L4"), class = "factor"), block = structure(c(1L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
                                                                                                                                                                                                                                                                                                                                                                                                                  4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("1", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          "2", "3", "4"), class = "factor")), row.names = c(NA, -128L), class = "data.frame")

Fit model and get estimated marginal means.

fit <- lmer(log10(y) ~ f1 * f2 * f3 + (1 | block), data = dat)
emm <- emmeans(fit, ~ f1 + f2 + f3, mode = 'Kenward-Roger', type = 'response')

Version 1

In this version, I take the CLD as a whole which correctly uses the Sidak adjustment for 496 tests. However let's say I wanted to plot only those rows where f2 == 'bar'. The letters are no longer correct because some are redundant (less than 8 are needed). Is there any function that can reduce the letters down?

cldisplay1 <- cld(emm, adjust = 'sidak', Letters = letters)
subset(as.data.frame(cldisplay1), f2 == 'bar') # correct comparisons but contains redundant letters

output

   f1  f2 f3  response        SE df  lower.CL   upper.CL    .group
8   D bar L1  365.6732   76.1231 96  185.9699   719.0244  a       
24  D bar L3  582.8573  121.3349 96  296.4229  1146.0742  ab      
16  D bar L2  682.9238  142.1659 96  347.3136  1342.8353  ab      
7   C bar L1  898.1560  186.9714 96  456.7740  1766.0470  abcd    
6   B bar L1 1627.7069  338.8438 96  827.8006  3200.5652   bcdefg 
15  C bar L2 1635.4393  340.4534 96  831.7330  3215.7694   bcdefg 
32  D bar L4 1746.6052  363.5951 96  888.2685  3434.3552   bcdefg 
31  C bar L4 2348.6629  488.9270 96 1194.4562  4618.1832    cdefgh
21  A bar L3 2499.6772  520.3640 96 1271.2573  4915.1230    cdefgh
5   A bar L1 2545.4594  529.8946 96 1294.5407  5005.1448    cdefgh
23  C bar L3 2561.0138  533.1326 96 1302.4512  5035.7294    cdefgh
30  B bar L4 3158.6969  657.5538 96 1606.4140  6210.9556      efgh
22  B bar L3 3364.9438  700.4887 96 1711.3047  6616.4994      efgh
14  B bar L2 3411.4009  710.1598 96 1734.9313  6707.8482      efgh
13  A bar L2 3769.4223  784.6900 96 1917.0098  7411.8269      efgh
29  A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551         h

Version 2

In this version, I use the by argument to cld() to split by f2. This reduces the letters within each group, but the Sidak adjustment is now less conservative. For example, row 8 and row 16 are not significantly different at the adjusted alpha-level from the comparison above, but now they are different. But I do not want to change the tests used, just to plot only a subset of the data. Is there a way to specify the number of tests I'm adjusting for as a whole, even though cld is split up with by groups?

cldisplay2 <- cld(emm, adjust = 'sidak', by = 'f2', Letters = letters)
subset(as.data.frame(cldisplay2), f2 == 'bar')

output

   f1  f2 f3  response        SE df  lower.CL   upper.CL  .group
8   D bar L1  365.6732   76.1231 96  185.9699   719.0244  a     
24  D bar L3  582.8573  121.3349 96  296.4229  1146.0742  ab    
16  D bar L2  682.9238  142.1659 96  347.3136  1342.8353  abc   
7   C bar L1  898.1560  186.9714 96  456.7740  1766.0470  abcd  
6   B bar L1 1627.7069  338.8438 96  827.8006  3200.5652   bcde 
15  C bar L2 1635.4393  340.4534 96  831.7330  3215.7694   bcde 
32  D bar L4 1746.6052  363.5951 96  888.2685  3434.3552    cde 
31  C bar L4 2348.6629  488.9270 96 1194.4562  4618.1832     de 
21  A bar L3 2499.6772  520.3640 96 1271.2573  4915.1230     def
5   A bar L1 2545.4594  529.8946 96 1294.5407  5005.1448     def
23  C bar L3 2561.0138  533.1326 96 1302.4512  5035.7294     def
30  B bar L4 3158.6969  657.5538 96 1606.4140  6210.9556      ef
22  B bar L3 3364.9438  700.4887 96 1711.3047  6616.4994      ef
14  B bar L2 3411.4009  710.1598 96 1734.9313  6707.8482      ef
13  A bar L2 3769.4223  784.6900 96 1917.0098  7411.8269      ef
29  A bar L4 7006.3740 1458.5342 96 3563.2217 13776.6551       f

I don't get it. You want to compare all 32 means, right? So just show the table with 32 rows. If you break it into 2 tables of 16 and start over with the grouping letters in each table, then you are not correctly displaying the comparison between a row in the first table and a row in the 2nd table. If yu don't want to show those comparisons, then you don't need to adjust for them, and you can go ahead and use the `by` variable. — Russ Lenth, Feb 02 '22 at 14:50
@RussLenth thanks for answering. I'd like to show it in plot form, with two separate plots, say one for `f2=='foo'` and one for `f2=='bar'`. I do not need the letters to correctly display comparisons between a mean in the first plot and a mean in the second plot. But I do want to adjust for all the comparisons as a whole and have the separated plots be consistent with the overall familywise error rate adjustment for 496 tests. Does that make sense? — qdread, Feb 02 '22 at 16:30
What do you mean by "plot"? Specifically, by that do you mean the compact letter displays? Or are you also showing the comparisons some other way? If not, it does NOT make sense to me to apply a multiplicity adjustment that includes tests that you never perform (or can't be discerned). Doing so is essentially reducing your overall alpha to something considerably less than what is stated, thus being unnecessarily conservative. — Russ Lenth, Feb 02 '22 at 18:02
I mean a graphical version of the CLD where the column `response` is shown as a point, with error bars extending from `lower.CL` to `upper.CL` for each group EMM. The letters appear by each group. The only wrinkle is to separate into >1 graph, keeping the correct adjustment but with fewer letters necessary for a given graph. — qdread, Feb 02 '22 at 18:55

Russ Lenth · Accepted Answer · 2022-02-03T03:44:20.047

With the two separate tables (or plots?) you are displaying a total of 90 + 90 = 180 comparisons. If you want an overall multiplicity adjustment for all of these 180 comparisons, you need to be considerably less conservative than for 496 comparisons. However, it is possible to speccify a different value of level so that the Sidak adjustment works out correctly. For example, if you want the overall alpha to be 0.05, use

cld(emm, adjust = 'sidak', by = 'f2', Letters = letters, 
    alpha = 1 - sqrt(0.95))

With this, you are specifying alpha = 0.02532. Note that if

p.adj  =  1 - (1 - p)^90  <  1 - sqrt(.95)

then

(1 - p)^90 > sqrt(.95)

so that

(1 - p)^180 > .95

thus

1 - (1 - p)^180  < .05

That is, by splitting the CLD table into two parts showing 90 comparisons each, we correctly apply the Sidak adjustment to correct for the 180 comparisons total at a significance level of .05.

Enhancement

Another idea based on this that results in a less conservative adjustment is to specify the Tukey adjustment instead:

cld(emm, adjust = 'tukey', by = 'f2', Letters = letters, 
    alpha = 1 - sqrt(0.95))

Thus, each separate table has an exact familywise error rate of 1 - sqrt(0.05); and we used the Sidak adjustment (slightly conservative) so that the error rate for the whole family of 180 tests is less than 0.05.

@qdread please consider this answer. I believe it corrects for the multiplicity of the comparisons shown. The same idea could be adapted to correct for 49 tests, but I won't show the details because I believe it's wrong to do that. — Russ Lenth, Feb 03 '22 at 01:35
Hey I just haven't had a chance to check it thoroughly yet. I will do that later today and most likely upvote&accept! — qdread, Feb 03 '22 at 13:17

How to separate compact letter display (CLD) in multcomp by group without changing the p-value adjustment method?

Problem

Reproducible example

Version 1

output

Version 2

output

1 Answers1

Enhancement