Create sample data. Note the use of stringsAsFactors here: I'm assuming your data are characters and not factors:
> d <- data.frame(c = c("a", "b", "c", "d", "e", "f"), p1 = c(NA, NA, "a", "b", "b", "d"), p2 = c(NA, NA, NA, "c", "c", "e"), stringsAsFactors = FALSE)
First tidy it up - make the data long, not wide, with each row being a child-parent pair:
> pairs = subset(reshape2::melt(d,id.vars="c",value.name="parent"), !is.na(parent))[,c("c","parent")]
> pairs
c parent
3 c a
4 d b
5 e b
6 f d
10 d c
11 e c
12 f e
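If you would rather not depend on reshape2, the same long table can be built in base R. This is only a sketch, assuming the columns are called c, p1 and p2 as in the sample data; the name pairs_base is mine:

```r
# Sample data as above, rebuilt so this runs on its own
d <- data.frame(c  = c("a", "b", "c", "d", "e", "f"),
                p1 = c(NA, NA, "a", "b", "b", "d"),
                p2 = c(NA, NA, NA, "c", "c", "e"),
                stringsAsFactors = FALSE)

# Stack the two parent columns under a single "parent" column
pairs_base <- rbind(
  data.frame(c = d$c, parent = d$p1, stringsAsFactors = FALSE),
  data.frame(c = d$c, parent = d$p2, stringsAsFactors = FALSE)
)

# Keep only rows that actually name a parent
pairs_base <- pairs_base[!is.na(pairs_base$parent), ]
```

With more than two parent columns you would loop over them rather than writing each data.frame() call out by hand.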
Now we can make a graph of the parent-child relationships. This is a directed graph, so each pair is plotted as an arrow from child to parent:
> library(igraph)
> g = graph_from_data_frame(pairs)
> plot(g)

Now I'm not sure exactly what you want, but igraph has functions for most graph queries. So for example, here's a search of the graph starting at d, from which we can get various bits of information:
> d_search = bfs(g, "d", mode="out", unreachable=FALSE, order=TRUE, dist=TRUE)
First, which nodes are ancestors of d? They are the ones that can be reached from d via the exhaustive (here, breadth-first) search:
> d_search$order
+ 6/6 vertices, named:
[1] d c b a <NA> <NA>
Note it includes d as well, which is trivial to drop from this list. The trailing <NA> entries stand for the two vertices the search never reached. What remains is the set of ancestors of d, which is what you asked for.
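If the ancestor set is all you need, you can skip the search object. A minimal sketch using igraph's subcomponent() and as_ids(); the pairs are retyped here so it runs on its own, and the names reach and ancestors_d are mine:

```r
library(igraph)

# Same child-parent pairs as in the answer
pairs <- data.frame(c      = c("c", "d", "e", "f", "d", "e", "f"),
                    parent = c("a", "b", "b", "d", "c", "c", "e"),
                    stringsAsFactors = FALSE)
g <- graph_from_data_frame(pairs)

# Every vertex reachable from d along child -> parent arrows (d included)
reach <- as_ids(subcomponent(g, "d", mode = "out"))

# Drop d itself to leave only its ancestors
ancestors_d <- setdiff(reach, "d")
```

Switching to mode = "in" follows the arrows backwards and so should give the descendants instead.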
What is the relationship of those nodes to d?
> d_search$dist
c d e f a b
1 0 NaN NaN 2 1
We see that e and f are unreachable, so are not ancestors of d. c and b are direct parents, and a is a grandparent. You can check this from the graph.
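To read the generations off programmatically rather than by eye, one option is to group the ancestors by distance. This sketch uses igraph's distances(), which reports unreachable vertices as Inf; the names dist_d, anc and by_generation are mine:

```r
library(igraph)

# Same child-parent pairs as in the answer
pairs <- data.frame(c      = c("c", "d", "e", "f", "d", "e", "f"),
                    parent = c("a", "b", "b", "d", "c", "c", "e"),
                    stringsAsFactors = FALSE)
g <- graph_from_data_frame(pairs)

# Number of child -> parent steps from d to every vertex
dist_d <- distances(g, v = "d", mode = "out")[1, ]

# Keep true ancestors only: reachable, and not d itself
anc <- dist_d[is.finite(dist_d) & dist_d > 0]

# Group ancestor names by generation: "1" = parents, "2" = grandparents, ...
by_generation <- split(names(anc), anc)
```

Here by_generation[["1"]] holds the direct parents and by_generation[["2"]] the grandparents.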
You can also get all the paths from any child upwards using functions like shortest_paths and so on.
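As a rough illustration of that last point, here is a sketch with shortest_paths() and all_simple_paths(), both of which are in igraph; the choice of d and a as endpoints and the variable names are just for the example:

```r
library(igraph)

# Same child-parent pairs as in the answer
pairs <- data.frame(c      = c("c", "d", "e", "f", "d", "e", "f"),
                    parent = c("a", "b", "b", "d", "c", "c", "e"),
                    stringsAsFactors = FALSE)
g <- graph_from_data_frame(pairs)

# One shortest upward route from d to its grandparent a
sp <- shortest_paths(g, from = "d", to = "a", mode = "out")
d_to_a <- as_ids(sp$vpath[[1]])

# Every upward path that starts at d, as character vectors of vertex names
up_paths <- lapply(all_simple_paths(g, from = "d", mode = "out"), as_ids)
up_labels <- vapply(up_paths, paste, character(1), collapse = " -> ")
```

all_simple_paths() can get expensive on large pedigrees with many shared ancestors, so its cutoff argument is worth a look if you only care about a few generations.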