
I am looking for a way in R to convert a phylogenetic tree (Newick format or class "phylo") into a data frame. The goal is to get a clear overview of which tips descend from each node of the tree. Does anyone have experience with this problem?

I can get all tip labels from a phylogenetic tree, and I can get all descendant nodes of a node, but I have not managed to get the tip labels that belong to a given node.

# for a random tree x
library(ape)  # provides rtree()
x <- rtree(10, tip.label = LETTERS[1:10])

#get all tip labels by asking for tree information
> x

Phylogenetic tree with 10 tips and 9 internal nodes.

Tip labels:
    H, G, D, B, I, C, ...

Rooted; includes branch lengths.

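# descendant nodes from a node (returns node numbers, not tip labels;
# in a 10-tip "phylo" object, nodes 1-10 are tips and 11-19 are internal)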
test <- phytools::getDescendants(x, node=5, curr=NULL)

#the package ggphylo seemed to have the answer to my problem, but it is no longer supported (last updates were in 2012)
ggphylo::tree.as.data.frame(x)

(I think conversion to a data frame is the easiest way, but if you know another approach to get the descendant tips of a node, I am open to any solution.)

Thomas Guillerme
kara

3 Answers


Are you looking for an edge table (i.e. the table that displays the connection between each tip and node)? You can access it directly in the phylo object using:

## The edge table
x$edge

You can find a fancier visualisation of it in the following SO question.
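As a minimal sketch of how the edge table can answer the original question, the snippet below walks the table recursively and collects the tip labels under a node. It uses a small hand-built edge matrix for the tree ((A,B),(C,D)) so that it runs in base R. `descendant_tips` is a hypothetical helper name, not a function from ape or phytools. It assumes the usual "phylo" numbering, where tips are 1..Ntip and internal nodes start at Ntip + 1.

```r
# Hand-built edge matrix for the tree ((A,B),(C,D)).
# Column 1 is the parent node, column 2 the child node.
# Tips are numbered 1..4, internal nodes 5..7 (5 is the root).
edge <- matrix(c(5, 6,
                 6, 1,
                 6, 2,
                 5, 7,
                 7, 3,
                 7, 4), ncol = 2, byrow = TRUE)
tip_labels <- c("A", "B", "C", "D")

# Hypothetical helper: collect the tip labels that descend from `node`.
descendant_tips <- function(edge, tip_labels, node) {
  n_tip <- length(tip_labels)
  # A tip has no children, so it only "contains" itself.
  if (node <= n_tip) return(tip_labels[node])
  # Otherwise recurse into every child of this node.
  children <- edge[edge[, 1] == node, 2]
  unlist(lapply(children, function(ch) descendant_tips(edge, tip_labels, ch)))
}

tips_node_6 <- descendant_tips(edge, tip_labels, 6)
tips_root   <- descendant_tips(edge, tip_labels, 5)
```

With a real "phylo" object, the same helper could presumably be called as `descendant_tips(x$edge, x$tip.label, node)`.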

Thomas Guillerme

The function compute.mr from the phytools package converts a phylogeny to its matrix representation, which might be what you are looking for.

library(ape)  # provides rtree()
x <- rtree(10, tip.label = LETTERS[1:10])
phytools::compute.mr(x, type = "matrix")
#   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# H "1"  "1"  "1"  "0"  "0"  "0"  "0"  "0" 
# E "1"  "1"  "1"  "0"  "0"  "0"  "0"  "0" 
# C "1"  "1"  "0"  "1"  "1"  "0"  "0"  "0" 
# I "1"  "1"  "0"  "1"  "1"  "0"  "0"  "0" 
# J "1"  "1"  "0"  "1"  "0"  "0"  "0"  "0" 
# F "1"  "0"  "0"  "0"  "0"  "1"  "0"  "0" 
# A "1"  "0"  "0"  "0"  "0"  "1"  "0"  "0" 
# D "0"  "0"  "0"  "0"  "0"  "0"  "1"  "0" 
# G "0"  "0"  "0"  "0"  "0"  "0"  "1"  "1" 
# B "0"  "0"  "0"  "0"  "0"  "0"  "1"  "1" 
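In a matrix like the one above, each column appears to stand for one clade, with "1" marking the tips inside it. A minimal sketch of turning such a matrix into a list of tip labels per column follows. The matrix `mr` here is a hand-typed stand-in for a four-tip tree, not real compute.mr output, and the name `clades` is illustrative.

```r
# Hand-typed stand-in for a matrix-representation result:
# rows are tips, columns are clades, "1" means the tip is in the clade.
mr <- matrix(c("1", "0",
               "1", "0",
               "0", "1",
               "0", "1"),
             ncol = 2, byrow = TRUE,
             dimnames = list(c("A", "B", "C", "D"), NULL))

# For each column, keep the row names (tip labels) that are marked "1".
# lapply() is used instead of apply() so the result is always a list,
# even when all clades happen to have the same size.
clades <- lapply(seq_len(ncol(mr)), function(j) rownames(mr)[mr[, j] == "1"])
```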
nya

Thank you for the suggestions. I found a colleague with the answer I was looking for. The function prop.part from the ape package gives a nice overview of all tips per node. The result is a list rather than a data frame but it does the trick.

# x is the "phylo" object from the question
list_all_tips <- c(prop.part(x, check.labels = TRUE))
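If a data frame is still wanted, the prop.part result can be flattened into one. The following is a minimal sketch under two assumptions. First, the entries of the result are indices into its "labels" attribute, as the ape documentation describes. Second, for a single tree the i-th entry belongs to internal node Ntip + i; this should be checked against a plot with node labels before relying on it. The names `pp`, `labs` and `tips_per_node` are illustrative.

```r
library(ape)

# Small example tree; any "phylo" object should work the same way.
tree <- read.tree(text = "((A,B),(C,D));")

pp   <- prop.part(tree)
labs <- attr(pp, "labels")        # tip labels that the indices refer to
n_tip <- length(tree$tip.label)

tips_per_node <- data.frame(
  # ASSUMPTION: the i-th partition corresponds to internal node Ntip + i
  node = n_tip + seq_along(pp),
  # Translate the indices back to labels and collapse them into one string
  tips = vapply(unclass(pp),
                function(idx) paste(labs[idx], collapse = ","),
                character(1)),
  stringsAsFactors = FALSE
)
```

Keeping `tips` as a single comma-separated string gives one row per node. A list column would be an alternative if the labels need further processing.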
kangaroo_cliff
kara