0

I'm trying to build a phylogenetic tree with bootstrapping using the methodology defined here. Plotting the name of each individual (named 1,2,3,4,... for the sake of this examples) is not informative since I have a HUGE data set. Therefore, I want the tips to be coloured to reflect the individual's species. The individual species information is stored in a separated matrix called species:

species = data.frame(Ind=c(1 ,2  ,3  ,4  ,5 ),
                     Spe=c("s1","se1","se2","se2","se3"))

Nevertheless, whenever I tried to replace the names of the individuals with its species in fit$tree$label, the only results is empting the vector. Probably this is because fit is an object of class pml, which my generate conflict with my strategy, but I'm not sure. Is there a way to colour the tips of the tree based on the individual species? My code looks like this, for now:

library(phangorn)
#Create a tree from data in fasta format
dat = read.phyDat(file = "myalignment.fasta", format ="fasta")
tree <- pratchet(dat)          # parsimony tree
mt <- modelTest(dat, tree=tree, multicore=TRUE)
mt[order(mt$AICc),]
bestmodel <- mt$Model[which.min(mt$AICc)]
env = attr(mt, "env")
fitStart = eval(get("GTR+G+I", env), env)
fit = optim.pml(fitStart, rearrangement = "stochastic", optGamma=TRUE, optInv=TRUE, model="GTR")
bs = bootstrap.pml(fit, bs=100, optNni=TRUE, multicore=TRUE)

#Replace the names with the species...
fit$tree$tip.label <- species[which(species[,1] == fit$tree$tip.label),2]
#If I print fit$tree$tip.label here, the output is factor(0)

#...and create the tree with colored tips
plotBS(midpoint(fit$tree), bs, p = 50, type="p", show.tip.label = FALSE)
tiplabels(pch=19, col = as.factor(fit$tree$tip.label), adj = 2.5, cex = 2)

REPRODUCIBLE EXAMPLE

Save the following with the name "myalignment.fasta" and run the code above. It should create a toy example to play with:

>1
AACCAGGAGAAAATTAA
>2
AAAAA---GAAAATTAA
>3
ACACAGGAGAAAATTAA
>4
AACCTTGAGAAAATTAT
>5
CCTGAGGAGAAAATTAA
j91
  • 451
  • 1
  • 4
  • 14
  • You should create a minimal [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Include sample data (we can't read files on your local machine). If we can copy/paste the code into R, it is easier to test and to help. If the question is about plotting, remove any code not relevant to the question. – MrFlick Sep 30 '16 at 14:38
  • Often package function include sample code that use built in data sets to demonstrate how functions work. Also `dput()` can be used to re-create variables as per the link I included. You can create minimal data to use in the question itself. I only suggest these things if you want to make it easier for people to help you. – MrFlick Sep 30 '16 at 14:49
  • @MrFlick I created a minimal "myalignment.fasta" that you can use to reproduce the code. – j91 Sep 30 '16 at 14:57

1 Answers1

2

So if your problem is list plotting points as colors rather than labels, you can do

tiplabels(pch=19, cex=2, 
    col = species$Spe[match(fit$tree$tip.label, species$Ind)])

I do a bit of gynmastics just to make sure the labels match up to the species (though they were in the same order in this example).

enter image description here

MrFlick
  • 195,160
  • 17
  • 277
  • 295