I believe the clue is in the documentation (?check.names):
data.names: names of the tips in the order of the data; if this is not
given, names will be taken from the names or rownames of the
object data
If you want the function to return the names of the taxa that are in the data frame but not in the tree, you need to either assign those names as the row names of your data frame or pass them separately via the data.names argument. Note that the default row names of a data frame are the character equivalents of the row numbers, which is exactly what you're seeing above ...
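To see that default for yourself, here is a small base-R sketch. The data frame is a made-up toy example (the column names and values are my assumptions, not your data):

```r
# Toy data frame (hypothetical values) with taxon names stored in a column
traitdata <- data.frame(Family = c("Felidae", "Canidae", "Ursidae"),
                        mass = c(4.2, 30.1, 250.0),
                        stringsAsFactors = FALSE)

# No row names were assigned, so R falls back to the row numbers as characters
default_names <- rownames(traitdata)
print(default_names)
```

Those fallback names are the only thing a name-matching function has to work with unless you tell it otherwise.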
edit based on additional information above:
R can't guess (or doesn't want to) that the names are contained in the Family element of your data frame. Try:
check.names(traitdata,tree,data.names=as.character(traitdata$Family))
Probably better in the long run to do:
rownames(traitdata) <- as.character(traitdata$Family)
traitdata <- subset(traitdata, select = -Family)
check.names(traitdata,tree)
because you don't want to have Family included in your data set of traits: it's an identifier, not a trait ...
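As a rough illustration of what that leaves you with, here is the same idea on a toy data frame. The trait columns mass and length and all values are hypothetical, made up for this sketch:

```r
# Hypothetical toy data: Family is an identifier, mass and length are traits
traitdata <- data.frame(Family = c("Felidae", "Canidae", "Ursidae"),
                        mass = c(4.2, 30.1, 250.0),
                        length = c(0.5, 1.0, 2.1),
                        stringsAsFactors = FALSE)

# Move the identifier into the row names, then drop the column
rownames(traitdata) <- as.character(traitdata$Family)
traitdata <- subset(traitdata, select = -Family)

str(traitdata)  # only the trait columns remain
```

Note that the column has to be dropped through the select argument of subset(); the second positional argument is the row filter.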
If you look at the structure of the example data given in the package:
data(geospiza)
geospiza.data
you can see that the taxon names are included as row names, not as a column in the data frame itself ...
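Once the taxon names are row names, the comparison against the tree is conceptually just a set difference. The following is only a sketch of that idea in base R, not the package's actual implementation; tip_labels is a hypothetical stand-in for tree$tip.label, and the data are made up:

```r
# Hypothetical stand-in for tree$tip.label
tip_labels <- c("Felidae", "Canidae", "Mustelidae")

# Toy trait data with taxon names as row names (made-up values)
traitdata <- data.frame(mass = c(4.2, 30.1, 250.0),
                        row.names = c("Felidae", "Canidae", "Ursidae"))

# Taxa present on one side but missing from the other
in_data_not_tree <- setdiff(rownames(traitdata), tip_labels)
in_tree_not_data <- setdiff(tip_labels, rownames(traitdata))

print(in_data_not_tree)
print(in_tree_not_data)
```

If the row names were left at their defaults, every row would show up as missing from the tree, which matches the symptom in the question.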
PS it's not as nice an interface as StackOverflow, but there's a very friendly and active R-for-phylogeny mailing list at r-sig-phylo@r-project.org ...