0

I have many subtrees saved in a file and I want to search them to find many things for each one of these subtrees like : the number of nodes, the number of leafs and the number of levels a subtree consist of...

To be more precise, the difference between a node and a leaf in my work; a node is any vertex in a subtree that could be a parent or a child where as a leaf is only a child vertex i.e every leaf is a node and the opposite is not true.

I am facing many problems in this work, the first one: the file containing the subtrees is not showing the rooted node and is not differentiating between parents and children..

The second problem: I read that for searching a tree programmers usually use a recursive method so I tried to search through the INTERNET for references or algorithms or pseudo-codes but all what I found is dealing with binary tree which is not my case (I am dealing with all configurations of a subtree) !!!

So could anyone kindly help me by giving a reference, an algorithm or an example for searching a tree to find the previous characteristics for such a subtree??

Another question: Is it possible to do this work using R ??

I will use any program to write the code but mainly I am interested in C.

Again,, please my subtree is not a binary one

UPDATE: Each subtree is represented in my file as a set of edges, You can see below an example of a subtree of size 4:

      44180 0 
      44180 18238
      44180 13362
      69677 44180

UPDATE: Sorry for the new update but Can I use R in my case even if there is a huge number of subtrees like 100000 subtrees each one with 20 edges (100000*20) ??

Noah16
  • 294
  • 2
  • 13

1 Answers1

0

A Tree is well defined.

I am going to assume that each file describes exactly one tree. It also seems reasonable to assume that the line 44180 0 means an edge from node 44180 to node 0. You should double check these assumptions.

With these assumptions, you can parse the file into the following data structure:

Node
    int id (id of self)
    Node* parent (pointer to parent Node)
    Node** children (list of children Nodes)

Or even simpler:

Node
    int id
    int parent (id of the parent node)
    int* children (ids of all children)

Once you have the entire file parsed into a list/array of Nodes, then pick ANY node, and recursively traverse up the parent node until you hit a node that has no parent. Then the final node must be the root of the tree.

Now that you have the pointer to the root Node, you should be able to apply other algorithms.

wookie919
  • 3,054
  • 24
  • 32
  • thanks for your comment and help...but my question how to start the subtree traversal using recursive method for a GENERAL subtree because all references I saw are dealing with binary tree to benefit from the fact we have a left and right nodes.. – Noah16 May 08 '17 at 11:49
  • @Noah16 Neither BFS nor DFS traversals (recursive or not) require that the tree is binary. What specific traversal algorithms are you looking at? – wookie919 May 08 '17 at 22:25
  • Maybe I am wrong because all example I saw about trees traversal were for a binary subtrees!! So what do you think the best algorithm I can use to get the number of (levels, leasf and nodes) of my subtree? – Noah16 May 09 '17 at 08:12
  • @Noah16 Both BFS and DFS can be used to retrieve all three. Note that you can get the number of nodes simply by parsing the input file. – wookie919 May 09 '17 at 22:02