I love the split violin plot and @jan-glx's awesome geom_split_violin function, created here: Split violin plot with ggplot2.

I would love to add split boxplots and stats to this, as I explain below.

First, to be complete, here are the full data and code.

Data (copied from the link above)

 set.seed(20160229)
 my_data = data.frame(
     y=c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
     x=c(rep('a', 2000), rep('b', 2000)),
     m=c(rep('i', 1000), rep('j', 2000), rep('i', 1000)))
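
As a quick sanity check on the design, the sketch below rebuilds the same data frame and cross-tabulates x against m. Each of the four combinations should hold 1000 observations, which is the n each half-violin would be expected to report. The name counts is introduced here for illustration only.

```r
# Same simulated data as above.
set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep("a", 2000), rep("b", 2000)),
  m = c(rep("i", 1000), rep("j", 2000), rep("i", 1000))
)

# Cross-tabulate x against m: one cell per half-violin.
# `counts` is a hypothetical name, not part of the original code.
counts <- table(x = my_data$x, m = my_data$m)
print(counts)
```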

Code to create the geom_split_violin function (copied from the link above)

 library('ggplot2')
 GeomSplitViolin <- ggproto("GeomSplitViolin", GeomViolin, 
                       draw_group = function(self, data, ..., draw_quantiles = NULL) {
   data <- transform(data, xminv = x - violinwidth * (x - xmin), xmaxv = x + violinwidth * (xmax - x))
   grp <- data[1, "group"]
   newdata <- plyr::arrange(transform(data, x = if (grp %% 2 == 1) xminv else xmaxv), if (grp %% 2 == 1) y else -y)
   newdata <- rbind(newdata[1, ], newdata, newdata[nrow(newdata), ], newdata[1, ])
   newdata[c(1, nrow(newdata) - 1, nrow(newdata)), "x"] <- round(newdata[1, "x"])
   if (length(draw_quantiles) > 0 && !scales::zero_range(range(data$y))) {
     stopifnot(all(draw_quantiles >= 0), all(draw_quantiles <= 1))
     quantiles <- ggplot2:::create_quantile_segment_frame(data, draw_quantiles)
     aesthetics <- data[rep(1, nrow(quantiles)), setdiff(names(data), c("x", "y")), drop = FALSE]
     aesthetics$alpha <- rep(1, nrow(quantiles))
     both <- cbind(quantiles, aesthetics)
     quantile_grob <- GeomPath$draw_panel(both, ...)
     ggplot2:::ggname("geom_split_violin", grid::grobTree(GeomPolygon$draw_panel(newdata, ...), quantile_grob))
   }
   else {
     ggplot2:::ggname("geom_split_violin", GeomPolygon$draw_panel(newdata, ...))
   }
 })
 geom_split_violin <- function(mapping = NULL, data = NULL, stat = "ydensity", position = "identity", ..., 
                               draw_quantiles = NULL, trim = TRUE, scale = "area", na.rm = FALSE, 
                               show.legend = NA, inherit.aes = TRUE) {
   layer(data = data, mapping = mapping, stat = stat, geom = GeomSplitViolin, 
         position = position, show.legend = show.legend, inherit.aes = inherit.aes, 
         params = list(trim = trim, scale = scale, draw_quantiles = draw_quantiles, na.rm = na.rm, ...))
 }

My attempt to add boxplots and stats

Here is the code that I used to try to add:

  1. Split boxplots.

  2. P values from a Wilcoxon test (wilcox.test).

  3. Sample sizes (n).

Code:

 library(ggpubr)
 give.n <- function(x){return(y = -2.6, label = length(x))}
 ggplot(my_data, aes(x, y, fill = m)) + 
      geom_split_violin() + 
      geom_boxplot(width = 0.2, notch = TRUE, fill="white", outlier.shape = NA) + 
      stat_summary(fun.data = give.n, geom = "text") + 
      stat_compare_means(aes(label = ifelse(p < 1.e-4, sprintf("p = %2.1e", 
           as.numeric(..p.format..)), sprintf("p = %5.4f", 
           as.numeric(..p.format..)))), method = "wilcox.test", paired = FALSE) + 
      stat_summary(fun.data = give.n, geom = "text")

This is the result:

[image]

Unfortunately, this throws an error and falls short of what I hoped for: the p values and the sample sizes (n) are missing, and the boxplots are not split. I also tried one of @Axeman's excellent suggestions in another SO answer, but no luck so far.
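
One likely cause of the error is give.n: return() accepts only a single value, so return(y = -2.6, label = length(x)) fails before anything is drawn. The "NA" p values may come from calling as.numeric() on ..p.format.., which is a formatted string rather than a number. The sketch below uses base R only and covers just these two points, not the split boxplots. It shows a version of the helper that returns one data frame, and it computes and formats the per-x Wilcoxon p values directly. The names give_n, p_values and p_label are hypothetical, introduced here for illustration.

```r
# Same simulated data as in the question.
set.seed(20160229)
my_data <- data.frame(
  y = c(rnorm(1000), rnorm(1000, 0.5), rnorm(1000, 1), rnorm(1000, 1.5)),
  x = c(rep("a", 2000), rep("b", 2000)),
  m = c(rep("i", 1000), rep("j", 2000), rep("i", 1000))
)

# return() takes a single value, so bundle the label position and the
# count into one data frame (give_n is a hypothetical replacement name).
give_n <- function(x) {
  data.frame(y = -2.6, label = length(x))
}

# One unpaired Wilcoxon rank-sum test per x level, comparing the two m groups.
p_values <- sapply(
  split(my_data, my_data$x),
  function(d) wilcox.test(y ~ m, data = d)$p.value
)

# Format the numeric p values directly instead of converting a string
# (p_label is a hypothetical helper).
p_label <- function(p) {
  ifelse(p < 1e-4, sprintf("p = %.1e", p), sprintf("p = %.4f", p))
}

print(give_n(my_data$y[my_data$x == "a" & my_data$m == "i"]))
print(p_label(p_values))
```

In the plot, give_n would take the place of give.n in stat_summary(fun.data = give_n, geom = "text"). Adding position = position_dodge(width = 0.5) there is one way to keep the two counts per x from overlapping; the width is a guess and would need tuning.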

What I am hoping to achieve is something similar to this (also with p values no longer "NA"):

[image]

This seems a big challenge, but I am hoping someone might be able to help, as others will probably love this as much as I do. Thank you.

Sylvia Rodriguez