16

Can I make such a plot with ggplot2?

# data
require(vegan)
data(dune)
data(dune.env)

# RDA
RDA <- rda(dune ~ A1, data = dune.env)

# extract species scores
df <- data.frame(spec_scores = scores(RDA)$species[ , 1], 
                 taxa = rownames(scores(RDA)$species))
df <- df[abs(df$spec_scores) > 0.05, ]

# plot
par(mar = c(5,4,4,8))
# boxplot of sites-scores along A1-axis
boxplot(scores(RDA)$sites[ , 1] ~ dune.env$Management)
abline(h = 0, lty = "dotted")

# add species scores to plot
rug(df$spec_scores, side=4)
linestack(df$spec_scores, labels=df$taxa, at = par("usr")[2], add = TRUE, hoff = 1)

enter image description here

Basicly I am looking for a way how to plot a labelled rug beneath the boxplot.

Any hints or suggestions?

EDi
  • 13,160
  • 2
  • 48
  • 57
  • I reckon that will need some specific grid-fu arrange for the text to be placed outside the plot region and line segments joining the labels with the rug ticks. If you do think of something or get responses I'll be interested in them as I have just taken the first tentative steps along the path to a package **ggvegan** which will hopefully (eventually) provide **ggplot** versions of all the plotting functions in **vegan**. – Gavin Simpson Nov 23 '12 at 16:08
  • 1
    @Gavin Simpson : Nice idea! I have also worked a little bit on plotting ordinations with ggplot2. There is also phyloseq on github. Is your project on github (for collaboration)? – EDi Nov 23 '12 at 16:19
  • Not yet, but will be soonish. I'm going to pop this one on github and sync to r-forge. Any and all contributions will be most gratefully received. Those tentative steps were mainly in my head and some exploratory code investigating `autoplot()`. Need some brainstorming as well - just produce the plot or return a useful object? Etc. – Gavin Simpson Nov 23 '12 at 16:40

2 Answers2

16

This is solution only using ggplot2, so it won't be most elegant one.

I started with first steps as suggested by @Eric Fail.

require(vegan)
data(dune)
data(dune.env)

# RDA
RDA <- rda(dune ~ A1, data = dune.env)

# extract species scores
df1 <- data.frame(RDA.scores = scores(RDA)$sites[ , 1], Management = dune.env$Management)
df2 <- data.frame(y = scores(RDA)$species[ , 1], taxa = rownames(scores(RDA)$species))
# Order data according to species scores
df2<-df2[order(df2$y),]

# define values for rugs (segments). 4.9 and 5.0 used because data has 4 levels (+1)
df2$x=4.9
df2$xend=5

# define coordinates for names (30 is number of species)
df2$yend2<-seq(min(df1$RDA.scores),max(df1$RDA.scores),length.out=30)
df2$xend2=5.3

# plot
P <- ggplot(data = df1, aes( x = Management, y = RDA.scores) ) + 
     geom_segment(y=0,yend=0,x=0,xend=5, lty=2, size = I(0.3)) + 
     geom_boxplot() + 
     #add extra levels to get space
     scale_x_discrete(limits=c("BF", "HF", "NM", "SF", "","")) + 
     #set y scale
     scale_y_continuous(limits=c(-3,5),expand=c(0,0)) +
     #add rugs as segments and add segments connecting rugs and texts
     geom_segment(data= df2, mapping=aes(x=x,xend=xend,y=y,yend=y), size = I(0.3)) +
     geom_segment(data= df2, mapping=aes(x=xend,xend=xend2,y=y,yend=yend2), size = I(0.4)) +
     #add texts
     geom_text(data=df2,mapping=aes(x=xend2+0.1,y=yend2,label=taxa),hjust=0, size=3) +
     #add rectangular to imitate box around plot
     geom_rect(xmax=5,xmin=0.4,ymax=5,ymin=-3,colour="black",fill=NA)

# Final adjustments of plot
P+theme(axis.line=element_blank(),
        panel.grid=element_blank(),
        panel.background=element_blank())

ggplot2 boxplot

Eric Fail
  • 8,191
  • 8
  • 72
  • 128
Didzis Elferts
  • 95,661
  • 14
  • 264
  • 201
4

I have started to work on a solution, but I'm kinda stuck. I'm posting it in the hope that someone of the hardcore data visualization people can contribute to a real solution (if this is not the way to do it please let me know and I'll delete this half-baked project).

# data
require(vegan)
data(dune)
data(dune.env)

# RDA
RDA <- rda(dune ~ A1, data = dune.env)

# extract species scores
df1 <- data.frame(RDA.scores = scores(RDA)$sites[ , 1], Management = dune.env$Management)
df2 <- data.frame(spec_scores = scores(RDA)$species[ , 1], taxa = rownames(scores(RDA)$species))
df2 <- df2[abs(df2$spec_scores) > 0.05, ]
df2 <- cbind(df2, x = rep(1:4, 6))

# plot
P <- ggplot(data = df1, aes( x = Management, y = RDA.scores) ) + geom_boxplot() + theme_bw()  +geom_hline(y=0, lty=2) + geom_rug(data= df2, mapping=aes(x = x , y = spec_scores), sides="r") + geom_text(data= df2, mapping=aes(x=5, y=spec_scores, label=taxa), size=2) + xlim(c("BF", "HF", "NM", "SF", ""))
P

half-baked project plot

Eric Fail
  • 8,191
  • 8
  • 72
  • 128