I want to show the distribution of several variables, as you would with a boxplot, violin plot, or beeswarm. But in this case, each variable would be a bar, with the density encoded as a color gradient along the bar.

Hopefully I don't need to manually draw bars as filled shapes.

Imagine if instead of the violins or boxplots, there was a bar with a gradient representing the density.

library(tidyverse)  # attaches ggplot2, dplyr, and tidyr

df = data.frame(
  A = 2.3 + 7*rnorm(100),
  B = 0 + 5*rnorm(100),
  C = 4 + 2*rnorm(100)
)

df %>%
  gather() %>%
  ggplot(aes(x=key, y=value)) + 
  geom_violin(scale="width", fill='red', alpha=0.5) + 
  geom_boxplot(fill='green', alpha=0.5)
abalter

1 Answer

This is my closest approximation of what I understood from your question:

library(ggplot2)

# Dummy data
df <- data.frame(
  y = c(rnorm(100, 4), rnorm(100, 12)),
  x = rep(c(1, 2), each = 100)
)

ggplot(df, aes(x, y, group = x)) +
  # To fill the gap between 0 and the actual data
  stat_summary(geom = "rect",
               fun.min = function(x) 0,  # fun.ymin / fun.ymax before ggplot2 3.3.0
               fun.max = min,
               aes(xmin = x - 0.4, xmax = x + 0.4, fill = 0)) +
  # To make the density part
  stat_ydensity(aes(fill = after_stat(density),  # stat(density) in older ggplot2
                    xmin = x - 0.4, xmax = x + 0.4,
                    # Nudge y by a bit depending on the spread of your data
                    ymin = after_stat(y - 0.01), ymax = after_stat(y + 0.01)),
                geom = "rect", trim = FALSE)

Does that fit the bill?

teunbrand
  • I think that will do it. I personally would set up the data and `aes` differently, more like the example I added. But I'm pretty sure I can use `stat_ydensity` in the same way. BTW, what are the +/- 0.4 and +/- 0.01 for? – abalter Oct 24 '19 at 17:47
  • Since `geom_rect` is parameterised as xmin, xmax, ymin and ymax and our (computed) data gives x/y parameterised (centered) data, we need to increase the width of the bar to a sensible number with xmin/xmax and the height of each band with the ymin/ymax. `geom_tile` would work with simple x/y parameterised data, but when I tried that, it didn't quite fill gaps in one of the bars since it tries to make regular heights for each tile, which doesn't match elegantly with the computed y-values. – teunbrand Oct 24 '19 at 17:51
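
For anyone who would rather not hand-tune the y nudge, here is a minimal sketch of the same idea with the density precomputed outside ggplot2, using the wide A/B/C data from the question. It is a swapped-in variant, not the `stat_ydensity` approach from the answer: base R's `density()` evaluates on an evenly spaced grid, so the band half-height can be taken from the grid spacing and neighbouring rectangles meet exactly. The helper `density_bands()`, the column `xpos`, the seed, the grid size of 256 and the white-to-red palette are all assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(
  A = 2.3 + 7 * rnorm(100),
  B = 0 + 5 * rnorm(100),
  C = 4 + 2 * rnorm(100)
)

# Hypothetical helper: evaluate a kernel density on a regular grid and turn
# each grid point (a centered y value) into a band with explicit edges.
density_bands <- function(values, key, n = 256) {
  d <- density(values, n = n)
  step <- d$x[2] - d$x[1]  # the grid is evenly spaced, so one step is enough
  data.frame(
    key = key,
    ymin = d$x - step / 2,
    ymax = d$x + step / 2,
    density = d$y
  )
}

bands <- do.call(
  rbind,
  lapply(names(df), function(k) density_bands(df[[k]], k))
)

# Numeric bar positions, so xmin/xmax can be computed as center +/- half-width
bands$xpos <- as.integer(factor(bands$key, levels = names(df)))

p <- ggplot(bands) +
  geom_rect(aes(xmin = xpos - 0.4, xmax = xpos + 0.4,
                ymin = ymin, ymax = ymax, fill = density)) +
  scale_x_continuous(breaks = seq_along(names(df)), labels = names(df)) +
  scale_fill_gradient(low = "white", high = "red") +
  labs(x = "key", y = "value")
print(p)
```

The +/- 0.4 plays the same role as in the answer (bar half-width), while the half-height now follows from the density grid, so it adapts to the spread of each variable. Each bar spans the range over which `density()` evaluates, which extends somewhat beyond the observed data.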