Run ANOVA, pairwise test and Compact Letter Display on plots on several dependant variables

Question

I have a dataset that looks like this, but with a few dozen more dependant variables.

set.seed(108)
test <- data.frame(
    n = 1:12,
    treatment = factor(paste("trt", 1:2)),
    type = sample(LETTERS, 3),
    var1 = sample(1:100, 12),
    var2 = sample(1:100, 12),
    var3 = sample(1:100, 12),
    var4 = sample(1:100, 12))

I would like to run a two-way ANOVA (effect of treatment and type on each of the dependant variables), and I am trying to do it automatically. Eventually, I'd like to plot barplots of each of the dependant variables including a compact letter display of the significance letters on each of the barplots. The letters would result from the ANOVA and pairwise comparison test, Example: https://statdoe.com/barplot-for-two-factors-in-r/ , section: Adding the compact letter display).

Could somebody give me hints to run this automatically? Or should I just give up and do it manually?

Paul Schmidt · Answer 1 · 2022-09-01T09:12:34.893

Does this work for you? If so, please also read my notes below.

# packages and function conflicts
library(broom)
library(conflicted)
library(emmeans)
library(multcomp)
library(multcompView)
library(tidyverse)

conflict_prefer("select", winner = "dplyr")
#> [conflicted] Will prefer dplyr::select over any other package

# data
set.seed(108)
test <- data.frame(
  n = 1:12,
  treatment = factor(paste("trt", 1:2)),
  type = sample(LETTERS, 3),
  var1 = sample(1:100, 12),
  var2 = sample(1:100, 12),
  var3 = sample(1:100, 12),
  var4 = sample(1:100, 12))

# Loop setup
var_names <- test %>% select(contains("var")) %>% names()
emm_list <- list()
anova_list <- list()

# Loop
for (var_i in var_names) {
  test_i <- test %>%
    rename(y_i = !!var_i) %>%
    select(treatment, type, y_i)
  
  mod_i <- lm(y_i ~ treatment*type, data = test_i)
  
  anova_list[[var_i]] <- mod_i %>% anova %>% tidy()
  
  emm_i <- emmeans(mod_i, ~treatment:type) %>%
    cld(Letters = letters)
  
  emm_list[[var_i]] <- as_tibble(emm_i)
}

# Combine emmean results
emm_out <- bind_rows(emm_list, .id = "id")

# Plot results
ggplot() +
  facet_wrap(~ id) +
  theme_classic() +
  # mean
  geom_point(
    data = emm_out,
    aes(y = emmean, x = type, color = treatment),
    shape = 16,
    alpha = 0.5,
    position = position_dodge(width = 0.2)
  ) +
  # red mean errorbar
  geom_errorbar(
    data = emm_out,
    aes(ymin = lower.CL, ymax = upper.CL, x = type, color = treatment),
    width = 0.05,
    position = position_dodge(width = 0.2)
  ) +
  # red letters
  geom_text(
    data = emm_out,
    aes(
      y = emmean,
      x = type, 
      color = treatment,
      label = str_trim(.group)
    ),
    position = position_nudge(x = 0.1),
    hjust = 0,
    show.legend = FALSE
  )

^{Created on 2022-09-01 with reprex v2.0.2}

First of all, this question is not a duplicate of, but similar to the question "loop Tukey post hoc letters extraction".
Then note that you are talking about two separate things: ANOVA results and pairwise comparison test results. The former is stored in anova_list and the second in emm_list/emm_out. Only the second can get you the compact letter display and I did that here via cld(). Pleas see this chapter on compact letter displays for more details. Note that in that chapter I also provide code for bar charts instead of the one above, but also explain why bar charts may not be the best choice.
The topic of compact letter displays is a bit more complex when you have two factors and especially interactions between them. I was assuming that you always want to look at the interaction term, but maybe check out this answer for details on what your options are.

Run ANOVA, pairwise test and Compact Letter Display on plots on several dependant variables

1 Answers1