
I have something like the following:

x <- 1:5
y <- 2:6

A <- matrix(NA,nrow=100,ncol=5)
for(i in 1:5){A[,i] <- rnorm(100,x[i],y[i])}

B <- matrix(NA,nrow=100,ncol=5)
for(i in 1:5){B[,i] <- runif(100,min=x[i],max=y[i])}

The following command creates a boxplot for the 5 columns of matrix A:

boxplot(A[,1:5])

What I would like now is a single plot in which the boxplot of each column of A sits directly next to the boxplot of the corresponding column of B. The two boxplots of a pair should touch, and there should be a small gap between the pairs for columns 1 to 5.

Thanks in advance!

Jaap
  • possible duplicate of [Plot multiple boxplot in one graph](http://stackoverflow.com/questions/14604439/plot-multiple-boxplot-in-one-graph) –  Sep 16 '14 at 08:42
  • You could try: `library(reshape2);d2 <- cbind(melt(A), value2=melt(B)[,3]); library(ggplot2); dM <- melt(d2, id.var=c("Var1", "Var2")); dM$Var2 <- factor(dM$Var2); ggplot(dM, aes(x=Var2, y=value, fill=variable))+ geom_boxplot()+scale_fill_manual(values=c("blue", "red"))` – akrun Sep 16 '14 at 08:43

2 Answers


Bind your matrices together columnwise, inserting NA columns:

C <- cbind(A[,1],B[,1])
for ( ii in 2:5 ) C <- cbind(C,NA,A[,ii],B[,ii])

(Yes, this is certainly not the most elegant way - but probably the simplest and easiest to understand.)
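If the loop feels clumsy, the same matrix can be built without it. The following is only a sketch: the names `blocks` and `C2` are made up for illustration, and `A` and `B` are re-created with `sapply` so that the snippet runs on its own.

```r
set.seed(1)  # only for reproducibility of this sketch
x <- 1:5
y <- 2:6
A <- sapply(1:5, function(i) rnorm(100, x[i], y[i]))
B <- sapply(1:5, function(i) runif(100, min = x[i], max = y[i]))

# one block per column index: A column, B column, NA spacer column
blocks <- lapply(1:5, function(i) cbind(A[, i], B[, i], NA))

# glue the blocks together and drop the trailing spacer
C2 <- do.call(cbind, blocks)
C2 <- C2[, -ncol(C2)]

dim(C2)  # 100 rows, 14 columns (5 pairs plus 4 spacers)
```

The column layout is the same as with the loop (A1, B1, NA, A2, B2, NA, ...), so the `boxplot` and `axis` calls that follow work unchanged on `C2`.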

Then boxplot and add axis labels:

boxplot(C,xaxt="n")
axis(1,at=1+3*(0:4),labels=rep("A",5),tick=FALSE)
axis(1,at=2+3*(0:4),labels=rep("B",5),tick=FALSE)
axis(1,at=1.5+3*(0:4),labels=1:5,line=2,tick=FALSE)

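An alternative that avoids the NA spacer columns is to interleave the columns of A and B and place the boxes with the `at` argument of `boxplot`. This is a sketch under assumptions, not part of the approach above: the names `AB` and `pos` and the two fill colours are arbitrary choices, and the data are re-created so the snippet is self-contained.

```r
set.seed(1)  # only for reproducibility of this sketch
x <- 1:5
y <- 2:6
A <- sapply(1:5, function(i) rnorm(100, x[i], y[i]))
B <- sapply(1:5, function(i) runif(100, min = x[i], max = y[i]))

# interleave the columns: A1, B1, A2, B2, ...
AB <- cbind(A, B)[, as.vector(rbind(1:5, 6:10))]

# x positions 1,2, 4,5, 7,8, ...: position 3, 6, 9, 12 stays empty as a gap
pos <- as.vector(rbind(1 + 3 * (0:4), 2 + 3 * (0:4)))

cols <- rep(c("lightblue", "salmon"), times = 5)
boxplot(AB, at = pos, xaxt = "n", col = cols)
axis(1, at = 1.5 + 3 * (0:4), labels = 1:5, tick = FALSE)
legend("topleft", legend = c("A", "B"), fill = c("lightblue", "salmon"))
```

Using colours plus a legend instead of per-box "A"/"B" labels is a design choice; the three `axis` calls from above would work here as well, since the positions are the same.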

Stephan Kolassa

An implementation with dplyr and tidyr:

# needed libraries
library(dplyr)
library(tidyr)
library(ggplot2)

# converting to dataframes
Aa <- as.data.frame(A)
Bb <- as.data.frame(B)

# melting the dataframes & creating a 'set' variable
mA <- Aa %>% gather(var,value) %>% mutate(set="A")
mB <- Bb %>% gather(var,value) %>% mutate(set="B")

# combining them into one dataframe
AB <- rbind(mA,mB)

# creating the plot
ggplot(AB, aes(x=var, y=value, fill=set)) +
  geom_boxplot() +
  theme_bw()



EDIT: To change the order of the boxes, you can use:

ggplot(AB, aes(x=var, y=value, fill=factor(set, levels=c("B","A")))) +
  geom_boxplot() +
  theme_bw()

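The same idea extends to more than two sets: the boxes within each group are drawn in the order of the factor levels, so any order can be requested explicitly. Below is a small self-contained sketch; the data frame `ABC` and its third set "C" are invented for illustration and are not part of the original data.

```r
library(ggplot2)

set.seed(1)  # only for reproducibility of this sketch
# hypothetical long-format data with three sets
ABC <- data.frame(
  var   = rep(rep(paste0("V", 1:5), each = 20), times = 3),
  value = c(rnorm(100), runif(100), rexp(100)),
  set   = rep(c("A", "B", "C"), each = 100)
)

# fix the display order once, in the data, instead of inside aes()
ABC$set <- factor(ABC$set, levels = c("A", "C", "B"))
levels(ABC$set)

p <- ggplot(ABC, aes(x = var, y = value, fill = set)) +
  geom_boxplot() +
  theme_bw()
print(p)
```

Setting the levels on the column rather than inside `aes()` also keeps the legend title as `set` instead of the full `factor(...)` expression.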

Jaap
  • Great solution, thanks. One more question on the plot: Assume I name the sets as "Set A" and "Set B", then in the plot, Set A is always plotted before set B. Is there a way to change that order of the boxplots? – AuditorOfReality Sep 17 '14 at 12:04
  • Sorry that I keep asking stupid questions, but is there a more "refined" way to change the order? For example, what if I have "A", "B", "C" as values of set, which should be displayed in the order "A", "C", "B"? – AuditorOfReality Sep 17 '14 at 12:52
  • @AuditorOfReality see the update, with `factor(set, levels=c("B","A"))` you can set the order of the factor-levels – Jaap Sep 17 '14 at 13:14
  • I am sorry, but I have to keep asking questions for which I cannot find answers via Google. Is there a simple way to change the color of the lines that form the borders of the boxplots so that they match the fill color? Adding a `colour = set` option to `aes()` hides the bar in the middle that indicates the median. – AuditorOfReality Sep 18 '14 at 14:43
  • As far as I know, you can't specifically set the color of the middle line (unless you really want to start hacking). What might be an idea is to use `notch=TRUE` to indicate that point in the boxplot. – Jaap Sep 18 '14 at 14:59
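
One possible workaround for the border colour question, offered as a sketch rather than a definitive fix: map both `colour` and `fill` to `set` and make the fill semi-transparent. To my understanding `alpha` in `geom_boxplot` affects only the fill, so the median line keeps the full border colour and remains visible against the lighter box. The data frame below is a stand-in with the same columns as `AB` in the answer, and the value `0.4` is an arbitrary choice.

```r
library(ggplot2)

set.seed(1)  # only for reproducibility of this sketch
# stand-in for the long-format data frame AB (columns var, value, set)
AB <- data.frame(
  var   = rep(rep(paste0("V", 1:5), each = 20), times = 2),
  value = c(rnorm(100), runif(100)),
  set   = rep(c("A", "B"), each = 100)
)

# borders follow the set colour; the fill is the same hue, but lighter
p <- ggplot(AB, aes(x = var, y = value, colour = set, fill = set)) +
  geom_boxplot(alpha = 0.4) +
  theme_bw()
print(p)
```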