I would like to use the cutree() function to cluster a phylogenetic tree into a specified number of clades. However, the phylo object (an unrooted phylogenetic tree) is not ultrametric, so as.hclust.phylo() returns an error. The goal is to sub-sample the tips of a tree while retaining maximum diversity, hence the desire to cluster into a specified number of clades and then randomly sample one tip from each clade. This will be done for a number of trees, each with a different number of desired samples. Any help coercing the unrooted tree into an hclust object, or a suggestion for a different method of systematically collapsing the trees (phylo objects) into a predefined number of clades, would be greatly appreciated.
library("ape")
library("ade4")
tree <- rtree(n = 32)
tree.hclust <- as.hclust.phylo(tree)
Returns: "Error in as.hclust.phylo(tree) : the tree is not ultrametric"
If I make a distance matrix of the branch lengths between all pairs of tips, I am able to use hclust() to generate clusters and then cutree() to cut them into the desired number of clusters:
dm <- cophenetic.phylo(tree)
single <- hclust(as.dist(dm), method="single")
cutSingle <- cutree(single, k = 10)   # named integer vector: tip label -> cluster id
color <- cutSingle[tree$tip.label]    # reorder cluster ids to match the tree's tip order
plot.phylo(tree, tip.color=color)
However, the results are not desirable because very basal branches get clustered together. Basing the clustering on the tree structure, or on the tip-to-root distance, would be preferable.
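For reference, here is a minimal sketch of the sub-sampling step I have in mind once a cluster assignment exists. It reuses the single-linkage clustering from above only as a stand-in. The value k = 10, the seed, and the variable names clusters, picked and subtree are illustrative assumptions, not part of any existing workflow.

```r
library(ape)

set.seed(1)               # only so the example is reproducible
tree <- rtree(n = 32)     # random tree, as in the example above
k <- 10                   # desired number of samples (illustrative value)

# Cluster tips on their pairwise branch-length distances (single linkage),
# then cut into k clusters; the result is a named vector: tip label -> cluster id
clusters <- cutree(hclust(as.dist(cophenetic.phylo(tree)), method = "single"), k = k)

# Draw one tip label at random from each cluster
picked <- vapply(split(names(clusters), clusters),
                 function(tips) tips[sample.int(length(tips), 1)],
                 character(1))

# Prune the tree down to the sampled tips
subtree <- drop.tip(tree, setdiff(tree$tip.label, picked))
```

Any other clustering that returns a named vector of cluster ids could be swapped in for the clusters line without changing the sampling or pruning steps.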
Any suggestions are appreciated!