0

I am trying to get annotations from a Tukey's test on to a facet'ed ggplot. I've tried multiple other answers (1,2), but these are not working due to the glm interaction or that I have groups with separate treatments. I'm not sure.

Some sample data

Raw_data <- data.frame(Sample = rep(1:48, 4),
                       Age = c(rep(c(rep("1do",16),rep("8wo",16),rep("11mo",16)),4)),
                       Treatment = c(rep(c(rep("Control",8),rep("Treated",8)),12)),
                       Gene = c(rep("Gene1",48),rep("Gene2",48),rep("Gene3",48),rep("Gene4",48)),
                       FC = c(rnorm(8, mean=1, sd=0.1),rnorm(8, mean=2.5, sd=0.1),
                              rnorm(8, mean=1.5, sd=0.1),rnorm(8, mean=1.5, sd=0.1),
                              rnorm(8, mean=1, sd=0.1), rnorm(8, mean=1, sd=0.1),
                              rnorm(8, mean=3, sd=0.1),rnorm(8, mean=1, sd=0.1),
                              rnorm(8, mean=1.5, sd=0.1),rnorm(8, mean=0.5, sd=0.1),
                              rnorm(8, mean=0.2, sd=0.1), rnorm(8, mean=1, sd=0.1),
                              rnorm(8, mean=4, sd=0.1),rnorm(8, mean=2.5, sd=0.1),
                              rnorm(8, mean=1.5, sd=0.1),rnorm(8, mean=0.5, sd=0.1),
                              rnorm(8, mean=1.1, sd=0.1), rnorm(8, mean=1.3, sd=0.1),
                              rnorm(8, mean=0.6, sd=0.1),rnorm(8, mean=1.9, sd=0.1),
                              rnorm(8, mean=1.5, sd=0.1),rnorm(8, mean=3.5, sd=0.1),
                              rnorm(8, mean=2.2, sd=0.1), rnorm(8, mean=0.3, sd=0.1)))
Raw_data$Age <- fct_relevel(Raw_data$Age, c("1do","8wo","11mo"))

And the plot:

Test_plot <- ggplot(data = Raw_data,
                          aes(x = Age, y = log2(FC), fill = Treatment)) +
  geom_boxplot() +
  stat_boxplot(geom ='errorbar', width = 0.4,
               position = position_dodge(width = 0.75)) +
  geom_point(aes(group = Treatment),
             position = position_dodge(width = 0.75),
             pch = 1, size = 4, alpha = 0.5) +
  stat_summary(aes(group = Treatment),
               position = position_dodge(width = 0.75),
               geom = "point",
               fun = "mean",
               colour = "black",
               size = 2,
               pch =24,
               fill = "red") +
  theme_bw() +
  theme(axis.title.x=element_blank(), 
        text = element_text(size=15),  
        plot.title = element_text(size=15, hjust = 0.5), 
        axis.text.x = element_text(size=10), 
        axis.text.y = element_text(size=8))+ 
  facet_wrap(Gene ~ ., ncol = 4, scales = "free") +
  ylab(expression(Log[2] ~ Fold ~ Change))

Test_plot

Gives this: Plot

I can get the confidence interval plots one gene at a time using:

test_stats_Tukey <- Raw_data %>%
  nest_by(Gene) %>%
  mutate(Model = list(glm(FC ~ Treatment * Age, data = data)))

par(mar=c(5,13,4,1)+.1)
plot(TukeyHSD(x= aov(test_stats_Tukey[[3]][[1]]), conf.level=0.95) , las=1 , col="brown")

Which gives:

CI plot

But I'd like to extract and plot annotations in facets automatically.

Any help would be great. Thank you!

Elk
  • 491
  • 2
  • 9

2 Answers2

0

It's not pretty, but it gets the job done!

Final_ann <- data.frame()
for (ann_facet in as.character(unique(Raw_data$Gene))) {
  temp_ann_data <- Raw_data[which(Raw_data$Gene == ann_facet),]
  temp_ann_model <- glm(FC ~ Treatment * Age, 
               data = temp_ann_data)
  
  temp_ann_pairs <- emmeans(temp_ann_model, 
                                 pairwise ~ Treatment*Age, 
                                 adjust = "tukey")
  
  temp_ann_model_cld <- cld(temp_ann_pairs$emmeans, 
                                 alpha = .05,
                                 Letters = letters)
  
  temp_ann_letters <- arrange(temp_ann_model_cld, Age, Treatment)
  colnames(temp_ann_letters)[8] <- "Annotation"
  temp_ann_letters$Annotation <- gsub("[[:space:]]", "",as.character(temp_ann_letters$Annotation))
  temp_ann_letters$Gene <- ann_facet
  temp_ann_letters$FC <-1.25*max(temp_ann_data$FC)
  Final_ann <- rbind(Final_ann, temp_ann_letters[,c(1,2,8,9,10)])
}

Test_plot + 
  geom_text(data = Final_ann,  
            aes(x = Age, y = log2(FC), group = Treatment, label=Annotation), 
            size=3, position = position_dodge(width = 0.75)) +
  facet_wrap(Gene ~ ., ncol = 4, scales = "free")

Success

Elk
  • 491
  • 2
  • 9
0

Please check out my answer here, where I went into a bit of detail on your options.

The outcome there that is closest to the one you are describing is this one: enter image description here

Paul Schmidt
  • 1,072
  • 10
  • 23
  • 1
    Excellent post. I'll be keeping my code but modifying to have the estimate and spread from your code. However I'm accepting this answer as the answer in the link is very complete! – Elk Jul 18 '22 at 16:54