A phylo object in R can have internal node labels (`phylo_obj$node.label`), but many R functions use node numbers instead of node labels. Even the phylo object itself uses node numbers to describe the edges (`phylo_obj$edge`) and does not seem to have a direct mapping from internal node labels to the node numbers used in `phylo_obj$edge`. How do I map node labels (e.g., "NodeA" or "Artiodactyla") to node numbers (e.g., 250 or 212)? I can't find any R functions, or any docs at all, on this.

- Can you give an example of a function that you would like to use, but requires the node number? It would help if you provided a small reproducible example, maybe starting with `phy <- rtree(n=10)` – G5W Aug 05 '18 at 21:17
- I believe that Thomas Guillerme has answered my question. There are some functions that require an integer specifying the internal node (e.g., `phangorn::Descendants`), but I wasn't sure how the node integer IDs mapped to the node labels (e.g., 1 <--> mammalia; 2 <--> aves, etc.). I don't want to use the wrong node integer and get the wrong descendants – sharchaea Aug 10 '18 at 08:35
2 Answers
Not exactly sure what the objective is here, but if you want to select specific node numbers in the edge table and their equivalents in the node labels vector, you can simply use `tree$node.label[node_number - Ntip(tree)]`.
In more detail:
## Simulating a random tree
library(ape)
set.seed(1)
my_tree <- rtree(10)
my_tree$node.label <- paste0("node", seq_len(Nnode(my_tree)))
## Method 1: selecting a node of interest (e.g. MRCA)
mrca_node <- getMRCA(my_tree, tip = c("t1", "t2"))
mrca_node
#[1] 16
`mrca_node` is now the ID of the node in the edge table (in this case a number higher than 10, the number of tips). To get the equivalent node label, simply subtract the number of tips from `mrca_node`:
## The node label for the mrca_node
my_tree$node.label[mrca_node-Ntip(my_tree)]
#[1] "node6"
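The same arithmetic can be run in reverse to go from a label to a node number, which is what the question asks for. A minimal sketch, assuming the internal node labels are unique; `label_to_node()` is a hypothetical helper (not an ape function) and the Newick tree is made up for illustration:

```r
library(ape)

# Small made-up tree with labelled internal nodes
tree <- read.tree(text = "((A,B)NodeAB,(C,D)NodeCD)Root;")

# Hypothetical helper: internal node label -> node number used in tree$edge.
# Assumes labels in tree$node.label are unique; returns NA for unknown labels.
label_to_node <- function(tree, label) {
  Ntip(tree) + match(label, tree$node.label)
}

node_cd <- label_to_node(tree, "NodeCD")

# Round trip back to the label, using the indexing from the answer
tree$node.label[node_cd - Ntip(tree)]

# Sanity check: the clade rooted at that node should contain tips C and D
extract.clade(tree, node_cd)$tip.label
```

The resulting number can then be passed to functions that expect a node ID, such as the `phangorn::Descendants` call mentioned in the comments.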
Alternatively, you can look up the labels for every node in the edge table:
## Method 2: directly extracting the nodes from the edge tables
# Function selecting the tip or node name corresponding to the edge row
select.tip.or.node <- function(element, tree) {
    ifelse(element < Ntip(tree)+1,
           tree$tip.label[element],
           tree$node.label[element-Ntip(tree)])
}
## Making the edge table
edge_table <- data.frame(
    "parent" = my_tree$edge[,1],
    "par.name" = sapply(my_tree$edge[,1],
                        select.tip.or.node,
                        tree = my_tree),
    "child" = my_tree$edge[,2],
    "chi.name" = sapply(my_tree$edge[,2],
                        select.tip.or.node,
                        tree = my_tree)
)
#   parent par.name child chi.name
#1      11    node1    12    node2
#2      12    node2     1      t10
#3      12    node2    13    node3
#4      13    node3     2       t6
#5      13    node3     3       t9
#6      11    node1    14    node4
#7      14    node4    15    node5
#8      15    node5    16    node6
#9      16    node6     4       t1
#10     16    node6    17    node7
#11     17    node7     5       t2
#12     17    node7     6       t7
#13     15    node5     7       t3
#14     14    node4    18    node8
#15     18    node8    19    node9
#16     19    node9     8       t8
#17     19    node9     9       t4
#18     18    node8    10       t5
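Because tip numbers run from 1 to `Ntip(tree)` and internal node numbers continue from there, concatenating the two label vectors gives a lookup vector that can be indexed directly by node number. A sketch of the same kind of edge table without a helper function, assuming the tree has both tip and node labels (the Newick tree and the name `all_labels` are made up for illustration):

```r
library(ape)

tree <- read.tree(text = "((A,B)NodeAB,(C,D)NodeCD)Root;")

# Position i of this vector holds the label of node number i
all_labels <- c(tree$tip.label, tree$node.label)

edge_table <- data.frame(
  parent   = tree$edge[, 1],
  par.name = all_labels[tree$edge[, 1]],
  child    = tree$edge[, 2],
  chi.name = all_labels[tree$edge[, 2]]
)
edge_table

# Label -> node number for tips and internal nodes alike
match(c("B", "NodeAB"), all_labels)
```

Indexing a vector is already vectorised, so this avoids the `sapply()` loop, at the cost of assuming that no tip shares a label with an internal node when `match()` is used for the reverse lookup.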

- Looking back at this years later, I still don't see why the `ape::phylo` object isn't clearer about how the `phylo$edge` matrix matches up to `phylo$tip.label`. It seems like only a small change in the code would be needed to add row names to the `phylo$edge` matrix that at least include the tip labels. – sharchaea Sep 05 '21 at 14:28
By default, the tips are numbered from 1 to n, where n is the number of tips. For example, the first tip in `phylo$tip.label` has node number 1.
The internal nodes are then numbered from n + 1 onward, in the same order as `phylo$node.label`. The node number of a specific node can be found from the edges in `phylo$edge`.
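This numbering convention can be checked directly on a small tree. A sketch, using a made-up Newick string and ape's `Ntip()` and `Nnode()` accessors:

```r
library(ape)

tree <- read.tree(text = "((A,B)NodeAB,(C,D)NodeCD)Root;")
n <- Ntip(tree)

# Tips occupy node numbers 1..n, in the order of tree$tip.label
tip_numbers <- seq_len(n)

# Internal nodes occupy n+1..n+Nnode, in the order of tree$node.label
internal_numbers <- n + seq_len(Nnode(tree))

# Every number in the edge matrix belongs to one of the two ranges
all(tree$edge %in% c(tip_numbers, internal_numbers))

# The root is the only node that never appears as a child
root <- setdiff(tree$edge[, 1], tree$edge[, 2])
root == n + 1
```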
