3

I would like to modify plots produced by the phyloseq package (download it from github). Phyloseq plots are ggplot2 objects, so I would think that I could add elements by adding ggplot2 objects to to the phyloseq-created object. In some cases this works, but not in others, and I don’t understand why. For example:

require(phyloseq)
require(grid)
require(ggplot2)
require(plyr)
#use the GlobalPatterns dataset from the Phyloseq package
GP <- GlobalPatterns
#do some preprocessing to the data
wh0 <- genefilter_sample(GP, filterfun_sample(function(x) x > 5), A = 0.5 * nsamples(GP))
GP1 <- prune_taxa(wh0, GP)
GP1 <- transform_sample_counts(GP1, function(x) 1e+06 * x/sum(x))
phylum.sum = tapply(taxa_sums(GP1), tax_table(GP1)[, "Phylum"], sum, na.rm = TRUE)
top5phyla = names(sort(phylum.sum, TRUE))[1:5]
GP1 <- prune_taxa((tax_table(GP1)[, "Phylum"] %in% top5phyla), GP1)
#ordination for NMDS plot using a Bray-Curtis distance
GP.ord <- ordinate(GP1, "NMDS", "bray") 
#create plot
p3 <- plot_ordination(GP1, GP.ord, type = "biplot", color = "SampleType", shape = "Phylum", title = "biplot")

Now I will attempt to add some envfit() arrows to the plot from the package vegan, see previous question here:

require(vegan)    
# First, lets apply envfit to the human/not human variable
    human = get_variable(GP1, "SampleType") %in% c("Feces", "Mock", "Skin", "Tongue")
    sample_data(GP1)$human <- factor(human)

    nmds.envfit <- envfit(GP.ord$points, env = as.data.frame(sample_data(GP1)$human), perm = 999) #standard envfit
    str(nmds.envfit)

    #data for the envfit arrows
    vec.sp.df<-as.data.frame(cbind((nmds.envfit$factors$centroids*sqrt(nmds.envfit$factors$r)),pvals=nmds.envfit$factors$pvals)) #this is necessary, see Gavin Simpson in the link provided above
    env.scores.nmds <- as.data.frame(vec.sp.df[vec.sp.df$pvals<0.05,]) #extracts relevant scores from envifit
    #extracts relevant scores from envifit
    env.scores.nmds <- cbind(env.scores.nmds, env.variables = c("Not Human", "Human")) #and then gives them their names
    env.scores.nmds

mult<- 1  #can change this if the arrows need to be a different length
###Now let us add these vectors to p3
p3 + geom_segment(data = env.scores.nmds,
                   aes(x = 0, xend = mult*MDS1, y = 0, yend = mult*MDS2),
                   arrow = arrow(length = unit(0.75, "cm")), colour = "black") + #arrows for envfit.  doubled the length for similarity to the plot() function. NB check ?envfit regarding arrow length if not familiar with lengths
      geom_text(data = env.scores.nmds,   #labels the environmental variable arrows * "mult" as for the arrows
                aes(x = mult*MDS1, y = mult*MDS2, label=env.variables),
                size = 6,
                hjust = -0.5) 

However, this returns an error: "Error in eval(expr, envir, enclos) : object 'id.type' not found"

If we try adding another type of ggplot2 element, it will work:

p3+ geom_hline(yintercept=0.75)
www
  • 38,575
  • 12
  • 48
  • 84
K. Brannen
  • 145
  • 1
  • 11

1 Answers1

0

The error message already lets you know what you need to fix in your added layer.

The ggplot2 object you want to modify has a column variable, id.type, in either p3$data or in the $data slot of one of its layers, and this is an aesthetic mapping argument that is passed to your new layer implicitly if you haven't overwritten it. Given that your added layer specifies x and y in both cases, I suspect that id.type is a faceting or color variable. In recent versions of ggplot2 you can include an argument inherit.aes=FALSE so that this inherited mapping is avoided, and in which case you'll be missing the unspecified mapping. The consequence is different depending on what it was (e.g. if facet, then layer appears in both panels, I think; and if color, then the layer gets assigned default color).

Alternatively, you can add an id.type column to the data for your new layers. It depends on what result you want to achieve.

Paul 'Joey' McMurdie
  • 7,295
  • 5
  • 37
  • 41