3

The type of plot I am trying to achieve in R seems to have been known as either as moving distribution, as joy plot or as ridgeline plot:

joy-plot

There is already a question in Stackoverflow whose recorded answer explains how to do it using ggplot: How to reproduce this moving distribution plot with R?

However, for learning purposes, I am trying to achieve the same using only base R plots (no lattice, no ggplot, no any plotting package).

In order to get started, I generated the following fake data to play with:

set.seed(2020)
shapes <- c(0.1, 0.5, 1, 2, 4, 5, 6)
dat <- lapply(shapes, function(x) rbeta(1000, x, x))
names(dat) <- letters[1:length(shapes)]

Then using mfrow I can achieve this:

par(mfrow=c(length(shapes), 1))
par(mar=c(1, 5, 1, 1))
for(i in 1:length(shapes))
{
    values <- density(dat[[names(dat)[i]]])
    plot(NA,
         xlim=c(min(values$x), max(values$x)),
         ylim=c(min(values$y), max(values$y)),
         axes=FALSE,
         main="",
         xlab="",
         ylab=letters[i])
    polygon(values, col="light blue")
}

The result I get is:

r-device-plot

Clearly, using mfrow (or even layout) here is not flexible enough and also does allow for the overlaps between the distributions.

Then, the question: how can I reproduce that type of plot using only base R plotting functions?

Marcus Campbell
  • 2,746
  • 4
  • 22
  • 36
ZXiu
  • 33
  • 2

2 Answers2

3

Here's a base R solution. First, we calculate all the density values and then manually offset off the y axis

vals <- Map(function(x, g, i) {
  with(density(x), data.frame(x,y=y+(i-1), g))
}, dat, names(dat), seq_along(dat))

Then, to plot, we calculate the overall range, draw an empty plot, and the draw the densities (in reverse so they stack)

xrange <- range(unlist(lapply(vals, function(d) range(d$x))))
yrange <- range(unlist(lapply(vals, function(d) range(d$y))))
plot(0,0, type="n", xlim=xrange, ylim=yrange, yaxt="n", ylab="", xlab="Value")
for(d in rev(vals)) {
  with(d, polygon(x, y, col="light blue"))
}
axis(2, at=seq_along(dat)-1, names(dat))

enter image description here

MrFlick
  • 195,160
  • 17
  • 277
  • 295
0
d = lapply(dat, function(x){
    tmp = density(x)
    data.frame(x = tmp$x, y = tmp$y)
})

d = lapply(seq_along(d), function(i){
    tmp = d[[i]]
    tmp$grp = names(d)[i]
    tmp
})

d = do.call(rbind, d)

grp = unique(d$grp)
n = length(grp)

spcx = 5
spcy = 3

rx = range(d$x)
ry = range(d$y)

rx[2] = rx[2] + n/spcx
ry[2] = ry[2] + n/spcy

graphics.off()
plot(1, type = "n", xlim = rx, ylim = ry, axes = FALSE, ann = FALSE)

lapply(seq_along(grp), function(i){
    x = grp[i]
    abline(h = (n - i)/spcy, col = "grey")
    axis(2, at = (n - i)/spcy, labels = grp[i])
    polygon(d$x[d$grp == x] + (n - i)/spcx,
            d$y[d$grp == x] + (n - i)/spcy,
            col = rgb(0.5, 0.5, 0.5, 0.5))
})

enter image description here

d.b
  • 32,245
  • 6
  • 36
  • 77