
I have data that I'm plotting as boxplots with ggplot2. It looks like this:

> head(varf)
             sID variable       value
1 SP_SA036,SA040   CM0001 0.492537313
2 SP_SA036,SA040   CM0001 0.479564033
3 SP_SA036,SA040   CM0001 0.559139785
4 SP_SA036,SA040   CM0001 0.526806527
5 SP_SA036,SA040   CM0001 0.009049774
6 SP_SA036,SA040   CM0001 0.451612903

The variable column contains 16 different IDs (from CM0001 to CM0016).

I also have a data frame with annotations:

category   annotation
CM001      HG4450
CM002      HG3288
..
CM016      MM8998

I would like to draw these annotations on top of my boxplots, but I couldn't find a way to do it. What is the right syntax for using geom_text with geom_boxplot?

Thanks

Hack-R
Rad

2 Answers


There are many ways to approach this problem, e.g. here and here. Probably the simplest way is to compute one value per box (here the median) and pass those values to geom_text:

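library(ggplot2)

# Median mpg per cylinder group, as a named vector (names "4", "6", "8")
meds <- c(by(mtcars$mpg, mtcars$cyl, median))
ggplot(mtcars, aes(factor(cyl), mpg)) +
    geom_boxplot() +
    # One label per box, drawn at that box's median
    geom_text(data=data.frame(), aes(x=names(meds), y=meds, label=1:3), col='red', size=10)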
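
The same idea can be adapted to data shaped like the question's. The following is only a minimal sketch: the `varf` values are made-up stand-ins for the real data, the name `labels` is hypothetical, and base R's `aggregate()` and `merge()` are assumed to be acceptable for building the label table.

```r
library(ggplot2)

# Toy stand-ins for the question's data (values are made up for illustration)
varf <- data.frame(
  variable = rep(c("CM0001", "CM0002"), each = 3),
  value    = c(0.49, 0.48, 0.56, 0.53, 0.01, 0.45)
)
anot <- data.frame(
  category   = c("CM0001", "CM0002"),
  annotation = c("HG4450", "HG3288")
)

# One row per box: the median value of each variable
labels <- aggregate(value ~ variable, data = varf, FUN = median)
# Look up the annotation that belongs to each variable ID
labels <- merge(labels, anot, by.x = "variable", by.y = "category")

# geom_text() inherits x = variable and y = value, but reads them from
# `labels`, so exactly one annotation is drawn per box
p <- ggplot(varf, aes(x = variable, y = value)) +
  geom_boxplot() +
  geom_text(data = labels, aes(label = annotation), colour = "red", vjust = -0.5)
print(p)
```

Because the label layer gets its own small data frame, the raw observations never reach `geom_text()`.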


Community
tonytonov
  • The only problem is that geom_text() would draw the same value over and over, because I have a very large data frame, not just 3 labels. – Rad Apr 16 '14 at 18:25
  • Are you sure about that? I don't see any overwriting here: `length(meds)` is exactly the number of your categories, so each label will be drawn only once. – tonytonov Apr 16 '14 at 18:29
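
The point made in the comments can be checked directly. This is a small sketch using the `mtcars` example from the answer: it compares the number of label values with the number of boxes and the number of raw observations.

```r
# Median mpg per cylinder group, as in the answer's example
meds <- c(by(mtcars$mpg, mtcars$cyl, median))

n_labels <- length(meds)                # values handed to geom_text()
n_boxes  <- length(unique(mtcars$cyl))  # boxes drawn by geom_boxplot()
n_rows   <- nrow(mtcars)                # raw observations

print(c(labels = n_labels, boxes = n_boxes, rows = n_rows))
# labels = 3, boxes = 3, rows = 32
```

Overplotting would only occur if the label layer were given the full data frame (one label per row) instead of the summarised values.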
# Example data mirroring the question (two of the 16 variables)
varf <- read.table(text = "sID variable       value
SP_SA036,SA040   CM0001 0.492537313
SP_SA036,SA040   CM0001 0.479564033
SP_SA036,SA040   CM0001 0.559139785
SP_SA036,SA040   CM0002 0.526806527
SP_SA036,SA040   CM0002 0.009049774
SP_SA036,SA040   CM0002 0.451612903", header = TRUE)

anot <- read.table(text = "category   annotation
CM0001      HG4450
CM0002      HG3288", header = TRUE)

# Attach each variable's annotation to its observations (left join)
varf <- merge(varf, anot, by.x = "variable", by.y = "category", all.x = TRUE)

library(ggplot2)
library(data.table)
# One row per box: its median (third element of quantile()) and its annotation
quants <- data.table(varf)[, list(quant = as.numeric(quantile(value)[3])), by = list(variable, annotation)]
ggplot(varf, aes(x = variable, y = value, fill = variable)) +
  geom_boxplot() +
  # Label each box with its annotation
  geom_text(data = quants, aes(x = variable, y = quant, label = annotation), size = 10)
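
If the labels should sit above each box rather than at the median, one option is to anchor them at each group's maximum. This is a hedged sketch, not part of the answer: the data values are made up, the name `tops` is hypothetical, and base R's `aggregate()` is used in place of data.table only to keep dependencies small.

```r
library(ggplot2)

# Toy data with the annotation already merged in (values are made up)
varf <- data.frame(
  variable   = rep(c("CM0001", "CM0002"), each = 3),
  annotation = rep(c("HG4450", "HG3288"), each = 3),
  value      = c(0.49, 0.48, 0.56, 0.53, 0.01, 0.45)
)

# One row per box: the largest value in each group, used as the label anchor
tops <- aggregate(value ~ variable + annotation, data = varf, FUN = max)

p <- ggplot(varf, aes(x = variable, y = value)) +
  geom_boxplot() +
  # vjust = -0.5 nudges the text just above the anchor point
  geom_text(data = tops, aes(label = annotation), vjust = -0.5) +
  # Extra headroom so the labels are not clipped at the top of the panel
  scale_y_continuous(expand = expansion(mult = c(0.05, 0.15)))
print(p)
```

`expansion()` requires ggplot2 3.3.0 or later; on older versions, `expand_limits()` could serve the same purpose.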


David Arenburg