How to plot significance in ggplot geom_bar() with multiple facets and groups?

Question

I created a multi-facet plot with 4 groups of 3 bars each in every panel using ggplot2.

[Figure: main bar plot]

I then ran a separate test for statistically significant differences between each pair of bar categories within each age group and each facet, which produced a separate data.frame, p.val.df. I now need to get significance brackets to appear between the bars, just like in this answer, but additionally within each age group. This is where I am stuck. Since I already have my p-values, I don't need to calculate them with geom_signif() from the ggsignif package; I would rather just plot them with geom_bracket() from the ggpubr package. Any approach that works is acceptable, though.

Here's the data and code:

library(ggplot2)
library(ggpubr)

# Main data
df <- data.frame(
  factor(rep(c("A1", "A2"), each = 12), levels = c("A1", "A2")),
  factor(rep(c("G", "M", "B"), each = 4), levels = c("G", "M", "B")),
  factor(rep(c("0-2", "3-5", "6-12", "13-17"), 6), levels = c("0-2", "3-5", "6-12", "13-17")),
  c(160, 162, 169, 108, 110, 111, 76, 73, 76, 45, 41, 38, 175, 
    177, 173, 174, 167, 172, 176, 162, 166, 143, 130, 143))

colnames(df) <- c("Class", "Type", "Age", "Coefficient")

# Intergroup difference significance
p.val.df <- data.frame(
  factor(rep(c("A1", "A2"), each = 12), levels = c("A1", "A2")),
  factor(rep(c("G", "M", "G"), each = 4), levels = c("G", "M")),
  factor(rep(c("B", "B", "M"), each = 4), levels = c("B", "M")),
  factor(rep(c("0-2", "3-5", "6-12", "13-17"), 6), levels = c("0-2", "3-5", "6-12", "13-17")),
  c(0.635, 0.584, 0.268, 0.051, 0.163, 0.779, 0.302, 0.361, 0.055, 0.425, 0.998, 0.055,
    0.707, 0.230, 0.000, 0.002, 0.418, 0.313, 0.211, 0.037, 0.675, 0.764, 0.011, 0.881))

colnames(p.val.df) <- c("Class", "Type1", "Type2", "Age", "p.value")

# Plotting
ggplot(df, aes(x = Age, y = Coefficient, fill = Type)) + 
  geom_bar(position = "dodge", stat = "identity") + 
  labs(y = "Coefficient", x = "Age", fill = "Type") + 
  facet_wrap( ~ Class, scales = "free") + 
  expand_limits(y = 300) +
  theme_classic() +
  # NOT SURE HOW TO PROCEED HERE
  geom_bracket(
    data = p.val.df, y.position = 250, step.increase = 0.1,
    aes(xmin = Type1, xmax = Type2, label = signif(p.value, 2)))

Welcome to SO. Have you considered that such a visualisation may get very crowded, i.e. very difficult to read? A different type of visualisation (or a different statistical approach, e.g. more complex regression models) may be more appropriate. Also, a lot of the code is not necessary for the question (e.g., the entire calls to `theme` and `scale`). — tjebo, Dec 19 '19 at 13:13
@Tjebo Thank you so much. I have considered the fact that these could get crowded, but in actual fact they won't, given that only a few combinations of coefficients are statistically significant. In particular, I need to create these significance brackets in the specific scenario where I don't have access to the data on which the post-regression statistics were run, so I only have the data with coefficients to plot and then a separate file with the corresponding p-values. I agree that calls to `theme` and `scale` are not strictly required, but I would leave it to the discretion of the users. — Denys D., Dec 19 '19 at 13:32
Less is more: https://stackoverflow.com/help/how-to-ask and also https://stackoverflow.com/help/minimal-reproducible-example - reducing your code to the essentials is not only a courtesy to those of us who want to help, it also makes it much more likely that we will actually help — tjebo, Dec 19 '19 at 15:57
Based on this issue https://github.com/kassambara/ggpubr/issues/65 , a good workaround might be to plot Type on the x-axis, use Age as your legend (3 sets of 4 bars vs 4 sets of 3 bars), and then follow their method for displaying p values — DanStu, Dec 19 '19 at 16:37
Hi @DenysD. did you get this working in the end? I'm struggling with the same issue! — DS14, Sep 14 '20 at 12:03
@DS14 Unfortunately, no. There are ways to make the brackets appear between the x categories, but not within them (if there are multiple groups, like in my case). At least, I wasn't able to find any reasonable solution that would not be more effort than it's worth. I am still curious to see if someone has come up with a solution. — Denys D., Sep 15 '20 at 13:09
@DenysD. Shame! I'm dealing with the same issue - 4 different groups faceted by sex. (so 8 bars in total). I love the way my figure looks and I'm able to add all the other stats within the facets. I have a temporary work-around where I add significance values with a dot rather than an asterisk over the bars in one facet to denote differences between males and females. Not ideal but it gets the point across. — DS14, Sep 15 '20 at 13:13
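
A sketch of one possible way to get brackets within each age group, not taken from this thread: instead of geom_bracket(), it computes the dodged bar positions by hand and draws the brackets with plain ggplot2 geom_segment() and geom_text(). It assumes the default dodge width of 0.9 and that brackets are only wanted for p < 0.05. The helper names (offset, sig, top, dodge_width) and the y spacing constants are illustrative choices, not part of any package.

```r
library(ggplot2)

types <- c("G", "M", "B")
ages  <- c("0-2", "3-5", "6-12", "13-17")

# Same data as in the question, with named columns
df <- data.frame(
  Class = factor(rep(c("A1", "A2"), each = 12)),
  Type  = factor(rep(rep(types, each = 4), 2), levels = types),
  Age   = factor(rep(ages, 6), levels = ages),
  Coefficient = c(160, 162, 169, 108, 110, 111, 76, 73, 76, 45, 41, 38,
                  175, 177, 173, 174, 167, 172, 176, 162, 166, 143, 130, 143))

p.val.df <- data.frame(
  Class = factor(rep(c("A1", "A2"), each = 12)),
  Type1 = rep(rep(c("G", "M", "G"), each = 4), 2),
  Type2 = rep(rep(c("B", "B", "M"), each = 4), 2),
  Age   = factor(rep(ages, 6), levels = ages),
  p.value = c(0.635, 0.584, 0.268, 0.051, 0.163, 0.779, 0.302, 0.361,
              0.055, 0.425, 0.998, 0.055, 0.707, 0.230, 0.000, 0.002,
              0.418, 0.313, 0.211, 0.037, 0.675, 0.764, 0.011, 0.881))

# Assumption: bars are dodged with width 0.9, so with three types the
# bar centres sit at -0.3, 0 and +0.3 around each integer Age position.
dodge_width <- 0.9
offset <- function(type) {
  (match(type, types) - (length(types) + 1) / 2) * dodge_width / length(types)
}

# Keep only the comparisons to annotate (assumed threshold: p < 0.05)
sig <- subset(p.val.df, p.value < 0.05)
sig$xmin <- as.integer(sig$Age) + offset(sig$Type1)
sig$xmax <- as.integer(sig$Age) + offset(sig$Type2)

# Height of the tallest bar in each Class/Age group
top <- aggregate(Coefficient ~ Class + Age, data = df, FUN = max)
names(top)[3] <- "top"
sig <- merge(sig, top)  # joins on the shared columns Class and Age

# Stack multiple brackets within the same Class/Age group
sig$rank <- ave(sig$p.value, paste(sig$Class, sig$Age), FUN = seq_along)
sig$y <- sig$top + 10 + 15 * (sig$rank - 1)

p <- ggplot(df, aes(x = Age, y = Coefficient, fill = Type)) +
  geom_col(position = position_dodge(width = dodge_width)) +
  geom_segment(data = sig, inherit.aes = FALSE,
               aes(x = xmin, xend = xmax, y = y, yend = y)) +
  geom_text(data = sig, inherit.aes = FALSE, vjust = -0.4, size = 3,
            aes(x = (xmin + xmax) / 2, y = y, label = signif(p.value, 2))) +
  facet_wrap(~ Class, scales = "free") +
  theme_classic()

print(p)
```

With the p-values from the question, only four comparisons fall below 0.05, all in class A2, so the A1 panel gets no brackets. Bracket tips are omitted here; they could be added as short vertical geom_segment() calls at xmin and xmax.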


0 Answers