1

I would like to use Bonferroni-adjusted p-values in a ggplot showing comparison bars, but I can't seem to figure it out.

If I use t.test method...

ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() + # using `ggsignif` to display comparison of interest
  geom_signif(
    comparisons = list(c("versicolor", "virginica"),c('versicolor','setosa'), c('virginica', 'setosa')),
    map_signif_level = TRUE,
    test = t.test,
    step_increase = .1,
    margin_top = 0.1,
    vjust = 0.5,
    textsize = 3)

...then I can't add the bonferroni correction.

If I instead use the t_test method (which DOES include a way to implement the bonferroni)...

ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() + # using `ggsignif` to display comparison of interest
  geom_signif(
    comparisons = list(c("versicolor", "virginica"),c('versicolor','setosa'), c('virginica', 'setosa')),
    map_signif_level = TRUE,
    test = t_test, # This method doesn't work inside geom_signif()
    p.adjust.method = "bonferroni" 
    step_increase = .1,
    margin_top = 0.1,
    vjust = 0.5,
    textsize = 3)

...it doesn't work at all, because t_test does not work in this context.

Any suggestions on how to include bonferroni-adjusted p-values on a ggplot?

TIA

JayPeerachai
  • 3,499
  • 3
  • 14
  • 29
datakritter
  • 590
  • 5
  • 19
  • Does it really have to be `ggplot2`? I could provide a solution using low level graphics. – jay.sf Dec 31 '21 at 22:55
  • 1
    Perhaps not, but maintenance will be easier in ggplot since most of my workplace is familiar with that. I'm open to a base R solution if it is straightforward enough that we can maintain it. – datakritter Dec 31 '21 at 23:23

1 Answers1

2

Here comes a base R solution. First, do a pairwise.t.test of value variable on group variable, while specifying the method for the adjustment.

p1 <- with(iris2, pairwise.t.test(Sepal.Length, Species, 
                                  p.adjust.method="bonferroni"))$p.value

Then make a boxplot() and in mapply's use arrows() for the ranges and text to put something on it, e.g. p-values. Here a rather hard-coded example; the logic is the following: x-values* are on 1, 2, ... n levels (of "Species" in this case), and y-values* are "above" the boxes, whose upper limit is max(Sepal.Length). The x, y values are put into vectors.

*See documentations ?arrows, ?text for further details.

ylim. <- with(iris, range(Sepal.Length)) + c(0, 1)  ## define ylim

boxplot(Sepal.Length ~ Species, iris2, ylim=ylim.)
mapply(\(w, x, y, z) arrows(w, x, y, z, code=3, angle=90, length=.05),
       c(1, 2, 1), c(7.75, 8, 8.25), c(2, 3, 3), c(7.75, 8, 8.25))
mapply(\(x, y, z) text(x, y, bquote(italic(p)*'='*.(z)), cex=.7),
       c(1.5, 2, 2.5), c(7.85, 8.35, 8.1), formatC(p1[!is.na(p1)], 3, fo='f'))

Note: R >= 4.1 used.

Gives:

enter image description here

If you prefer significance stars, you could try the following,

c('', '*', '**', '***')[rowSums(outer(p1[!is.na(p1)], 
                                      c(1, .05, .01, .001), `<`))]

and put it as z in the text-mapply.


Data:

set.seed(42)
iris2 <- iris[sample(nrow(iris), 21), ]
jay.sf
  • 60,139
  • 8
  • 53
  • 110