9

I'd like to align the bottom barplot in the following so that the groups line up vertically between the two plots:

par(mfrow = c(2, 1))
n = 1:5
barplot(-2:2, width = n, space = .2)

barplot(matrix(-10:9, nrow = 4L, ncol = 5L), beside = TRUE,
        width = rep(n/4, each = 5L), space = c(0, .8))

Two bar plots stacked vertically. Each shows 5 "groups" of bars -- in the top plot, each "group" is just a single bar, but in the bottom plot, each group is 4 bars. Since these groups correspond, we might expect them to align vertically across the two plots, but this is not the case.

I've been staring at the definition of the space and width arguments to barplot (from ?barplot) for a while and I really expected the above to work (but clearly it didn't):

width -- optional vector of bar widths. Re-cycled to length the number of bars drawn. Specifying a single value will have no visible effect...

space -- the amount of space (as a fraction of the average bar width) left before each bar. May be given as a single number or one number per bar. If height is a matrix and beside is TRUE, space may be specified by two numbers, where the first is the space between bars in the same group, and the second the space between the groups. If not given explicitly, it defaults to c(0,1) if height is a matrix and beside is TRUE, and to 0.2 otherwise.

As I read it, this means we should be able to match the group widths in the top plot by dividing each group's width by 4 (hence n/4). For space, since we're dividing each bar's width by 4, the average width is divided by 4 as well; hence we should multiply the fraction by 4 to compensate (hence space = c(0, 4*.2)).

However, it appears this is being ignored. In fact, it seems all the bars have the same width! In tinkering around, I've only been able to get the relative within-group widths to vary.

Will it be possible to accomplish what I've got in mind with barplot? If not, can someone say how to do this in e.g. ggplot2?

MichaelChirico
  • 1
    if i find time i'll dive into the source of `barplot`, i have a feeling the documentation is lying... – MichaelChirico May 03 '18 at 07:09
  • 1
    It seems like you only can have one `width` per row for the input matrix. `width` seems to use `nrow` values of the `width` vector, which are then recycled. The rest of the values are discarded. Start here: `barplot(matrix(1:6, nrow = 3L, ncol = 2L), beside = TRUE)`. Add `width` values, one per row: `barplot(matrix(1:6, nrow = 3L, ncol = 2L), beside = TRUE, width = c(1:3))` - recycled across columns (groups). Try with one `width` value per element (as you did): `barplot(matrix(1:6, nrow = 3L, ncol = 2L), beside = TRUE, width = c(1:3, 3:1))`. Nope, only the three first (`nrow`) are used. – Henrik May 03 '18 at 08:09
  • 1
    ...This recycling (and discard) rule means that to be able to create column specific `width`s, the data needs to be reshaped, so that a `width` can be assigned to each element, as nicely described by @Len. (just needed to clarify my previous (now deleted) a bit sloppy comment... ;) ) – Henrik May 03 '18 at 10:06
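
The recycling rule described in the comments above can be checked without drawing anything, since `barplot()` returns the bar midpoints and accepts `plot = FALSE`. A minimal sketch (the names `h`, `mids_per_row` and `mids_per_bar` are made up for illustration):

```r
h <- matrix(1:6, nrow = 3L, ncol = 2L)

# one width per row of the matrix, recycled across the groups
mids_per_row <- barplot(h, beside = TRUE, width = 1:3, plot = FALSE)

# one width per bar: if only the first nrow(h) values are used,
# the extra three widths make no difference to the bar positions
mids_per_bar <- barplot(h, beside = TRUE, width = c(1:3, 3:1), plot = FALSE)

identical(mids_per_row, mids_per_bar)
```

If the two sets of midpoints are identical, the widths beyond the first `nrow(h)` were dropped, which is why per-group widths cannot be set through a matrix with `beside = TRUE`.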

4 Answers

12

It is possible to do this with base plot as well, but it helps to pass the matrix as a vector for the second plot. You then need to keep in mind that the space argument is a fraction of the average bar width. I did it as follows:

par(mfrow = c(2, 1))
widthsbarplot1 <- 1:5
spacesbarplot1 <- c(0, rep(.2, 4))

barplot(-2:2, width = widthsbarplot1, space = spacesbarplot1)

widthsbarplot2 <- rep(widthsbarplot1/4, each = 4)
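# space is a fraction of the *mean* bar width, so rescale the top plot's .2
# by the ratio of the two mean widths: .2 * 3 / .75 = .8
spacesbetweengroupsbarplot2 <- .2 * mean(widthsbarplot1) / mean(widthsbarplot2)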

allspacesbarplot2 <- c(rep(0,4), rep(c(spacesbetweengroupsbarplot2, rep(0,3)), 4))

matrix2 <- matrix(-10:9, nrow = 4L, ncol = 5L)

barplot(c(matrix2),
    width = widthsbarplot2,
    space = allspacesbarplot2,
    col = c("red", "yellow", "green", "blue"))
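
One way to check that the groups really line up is to compare the left edges numerically instead of by eye. The sketch below is only an illustration under the rule quoted from ?barplot (gap drawn = space times the mean bar width); the variable names are hypothetical, and `plot = FALSE` makes `barplot()` return the midpoints without drawing.

```r
w_top <- 1:5
s_top <- c(0, rep(.2, 4))

# gap actually drawn between bars = space * mean(width)
gap_top <- .2 * mean(w_top)

w_bot <- rep(w_top / 4, each = 4)
# the same gap, expressed as a fraction of the bottom plot's mean width
s_group <- gap_top / mean(w_bot)
s_bot <- c(rep(0, 4), rep(c(s_group, rep(0, 3)), 4))

# bar midpoints of both plots, computed without drawing
mid_top <- c(barplot(-2:2, width = w_top, space = s_top, plot = FALSE))
mid_bot <- c(barplot(-10:9, width = w_bot, space = s_bot, plot = FALSE))

# left edge of each top bar vs. left edge of the first bar in each group
left_top <- mid_top - w_top / 2
left_bot <- (mid_bot - w_bot / 2)[seq(1, 20, by = 4)]

all.equal(left_top, left_bot)
```

Working in drawing units (`gap_top`) first and converting back to a fraction afterwards avoids having to reason about two different mean widths at once.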

Base plot

Lennyy
  • Smart thinking! Just splay the matrix into a vector and fake the spacing... I need more sleep... i made a few edits to simplify the code a bit, hope you don't mind :) – MichaelChirico May 03 '18 at 09:24
  • No problem, it is more readable now indeed! And I hope this helps to get the desired plot of your real data as well. :) – Lennyy May 03 '18 at 09:30
6

You can actually pass widths in ggplot as vectors as well. You'll need the dev version of ggplot2 to get the correct dodging though:

library(dplyr)
library(ggplot2)

df1 <- data.frame(n = 1:5, y = -2:2)
df1$x <- cumsum(df1$n)
df2 <- data.frame(n = rep(1:5, each = 4), y2 = -10:9)
df2$id <- 1:4                                                    # just for the colors

df3 <- full_join(df1, df2)

p1 <- ggplot(df1, aes(x, y)) + geom_col(width = df1$n, col = 1)
p2 <- ggplot(df3, aes(x, y2, group = y2, fill = factor(id))) + 
  geom_col(width = df3$n, position = 'dodge2', col = 1) +
  scale_fill_grey(guide = 'none')

cowplot::plot_grid(p1, p2, ncol = 1, align = 'v')


Axeman
6

Another way, using only base R and still using barplot (not going "down" to rect) is to do it in several barplot calls, with add=TRUE, playing with space to put the groups of bars at the right place.

As already highlighted, the problem is that space is proportional to the mean of width. So you need to correct for that.

Here is my way:

# setup: bar widths from the question and the matrix for the bottom plot
# (pr_bp2 is not defined in the post; assumed here to be the question's matrix)
n <- 1:5
pr_bp2 <- matrix(-10:9, nrow = 4L, ncol = 5L)
par(mfrow = c(2, 1))

# draw first barplot, getting back the bar midpoints
bp <- barplot(-2:2, width = n, space = .2)

# get the xlim
x_rg <- par("usr")[1:2]

# plot the "frame"
plot(0, 0, type="n", axes=FALSE, xlab="", ylab="", xlim=x_rg, xaxs="i", ylim=range(as.vector(pr_bp2)))

# plot the groups of bars, one at a time, specifying space, with a correction according to width, so that each group starts where it should
sapply(1:5, function(i) barplot(pr_bp2[, i, drop=FALSE], beside = TRUE, width = n[i]/4, space = c((bp[i, 1]-n[i]/2)/(n[i]/4), rep(0, 3)), add=TRUE))
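
The space correction can be sanity-checked without drawing, as a rough sketch: with `plot = FALSE`, `barplot()` only returns the midpoints, so each group's start and end can be compared with the corresponding bar of the top plot. The matrix `m` and the helper names below are assumptions for illustration, using the data from the question.

```r
n <- 1:5
m <- matrix(-10:9, nrow = 4L, ncol = 5L)  # assumed data, as in the question

# midpoints and left edges of the single bars in the top plot
bp <- barplot(-2:2, width = n, space = .2, plot = FALSE)
left <- as.vector(bp) - n / 2

# for each group: divide the wanted offset by the bar width, because
# barplot multiplies space by the mean width (here simply n[i] / 4)
group_span <- t(sapply(seq_along(n), function(i) {
  w <- n[i] / 4
  mids <- barplot(m[, i, drop = FALSE], beside = TRUE, width = w,
                  space = c(left[i] / w, rep(0, 3)), plot = FALSE)
  c(start = mids[1] - w / 2, end = mids[4] + w / 2)
}))

all.equal(unname(group_span[, "start"]), left)
all.equal(unname(group_span[, "end"]), left + n)
```

Because each call only sees the four bars of one group, the mean width inside that call is just the common bar width, which is what makes the correction a plain division.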


Axeman
Cath
  • 1
    This also facilitates the following bells and whistles for adorning a similar plot further -- (1) if adding `arrows` and you'd like them to have different lengths for each group (to match the different bar widths), it's much easier in this loop (2) if each group should have a different within-group color scheme, it's trivial in separate `barplot` calls ([see also](https://stackoverflow.com/questions/31840378)) (3) if you'd like to have different `cex.names` under each group (again to match the widths), it's trivial in separate `barplot` calls – MichaelChirico May 04 '18 at 07:08
5

You can do this in ggplot2 by setting the x-axis locations of the bars explicitly and using geom_rect for plotting. Here's an example that's probably more complicated than it needs to be, but hopefully it will demonstrate the basic idea:

library(tidyverse)

sp = 0.4

d1 = data.frame(value=-2:2) %>% 
  mutate(key=paste0("V", 1:n()),
         width=1:n(),
         spacer = cumsum(rep(sp, n())) - sp,
         xpos = cumsum(width) - 0.5*width + spacer)

d2 = matrix(-10:9, nrow = 4L, ncol = 5L) %>% 
  as.data.frame() %>%   # as.tibble() is deprecated; this also names the columns V1..V5
  gather(key, value) %>%
  mutate(width = as.numeric(gsub("V","",key))) %>% 
  group_by(key) %>% 
  mutate(width = width/n()) %>% 
  ungroup %>% 
  mutate(spacer = rep(cumsum(rep(sp, length(unique(key)))) - sp, each=4),
         xpos = cumsum(width) - 0.5*width + spacer)

d = bind_rows(list(d1=d1, d2=d2), .id='source') %>% 
  group_by(source, key) %>% 
  mutate(group = LETTERS[1:n()])

ggplot(d, aes(fill=group, colour=group)) +
  geom_rect(aes(xmin=xpos-0.5*width, xmax=xpos+0.5*width, ymin=0, ymax=value)) +
  facet_grid(source ~ ., scales="free_y") +
  theme_bw() +
  guides(fill="none", colour="none") +
  scale_x_continuous(breaks = d1$xpos, labels=d1$key)
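
The position arithmetic behind `xpos` does not depend on the tidyverse. A small base R sketch of the same calculation (variable names are made up here) shows that the four narrow bars of each group span exactly the same x-range as the wide bar they belong to:

```r
sp <- 0.4
width <- 1:5

# wide bars: cumulative widths plus one spacer per preceding bar
spacer <- (seq_along(width) - 1) * sp
xpos <- cumsum(width) - width / 2 + spacer

# narrow bars: four per group, each a quarter of the group width,
# sharing the spacer of their group
bar_width <- rep(width / 4, each = 4)
bar_spacer <- rep(spacer, each = 4)
bar_xpos <- cumsum(bar_width) - bar_width / 2 + bar_spacer

# first bar of each group starts where the wide bar starts,
# and the fourth bar ends where the wide bar ends
first <- seq(1, 20, by = 4)
all.equal(xpos - width / 2, (bar_xpos - bar_width / 2)[first])
all.equal(xpos + width / 2, (bar_xpos + bar_width / 2)[first + 3])
```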


eipi10
  • 2
    i was hoping a `ggplot` solution would be more concise; will take a closer look in a bit... for base, I know the (ultimate) backup would be to just build the plot with `rect`, but i'd rather not go full tedium – MichaelChirico May 03 '18 at 06:49
  • 2
    Well, it's late. Maybe I'll wake up in morning with a more elegant answer! – eipi10 May 03 '18 at 06:51
  • To be fair, most of that code is to construct the data frame :) – neilfws May 03 '18 at 21:56