0

I'm having problems converting a binary rooted tree to newick format. The full explanation for such a format can be found: http://code.google.com/p/mrsrf/wiki/NewickTree

An example of a newick format would be as follows:

for a tree T such as http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic8/images/completetreetwo.gif the newick representation would be: (((8,9),(10,11)),((12,13),(14,15)))

the internal node will become the commas while the leaves will be retained.

such trees have internal nodes which will always have 2 children.

I have a problem using recursion to come out with this newick format. The output contains far too many nodes and braces.

Any comments to resolve this problem is appreciated or even an iterative algorithm would be welcomed

import java.util.Stack;

public class Tree {

....

public String inOrderNewick(Node root, String output) throws ItemNotFoundException {
    if (root.hasChild()) {
        output += "(";
        output += inOrderNewick(root.child1, output);
        output += ",";
        output += inOrderNewick(root.child2, output);
        output += ")";
        return output;
    } else {
        return root.getSeq();
    }    

}

//edit: implemented the change as advised. but the desired output for a tree is ((S3,(S1,S2)),(S4,S5)) where the actual output is (((S3,((S3,(S1,S2)),(((S3,((S3,(S1,S2)),(S4,S5))

This tells me that there are logic errors. Perhaps there is a need to have flags?

Esmond
  • 71
  • 2
  • 9

2 Answers2

1

Maybe you have a flaw in understanding recursion. You don't need the "output" argument. When you compute a subtree you don't need the representation of the previous nodes. Make it like this:

public String inOrderNewick(Node root) throws ItemNotFoundException {
    if (root.hasChild()) {
        String output = "";
        output += "(";
        output += inOrderNewick(root.child1);
        output += ",";
        output += inOrderNewick(root.child2);
        output += ")";
        return output;
    } else {
        return root.getSeq();
    }
}
Petar Minchev
  • 46,889
  • 11
  • 103
  • 119
  • thanks, that did take the number of extra char down but there is still some logic error. the output i am getting for tree that should be (S3,(S1,S2)),(S4,S5)) is (((S3,((S3,(S1,S2)),(((S3,((S3,(S1,S2)),(S4,S5)) will i need some flags to handle this? – Esmond Apr 10 '10 at 08:54
  • Are you sure the tree was properly constructed? – Petar Minchev Apr 10 '10 at 09:06
  • Yea a call for the size of the tree returns 9 with 5 leaves and 4 internal nodes. An inorder traversal returns S3 r2 S1 r2 S2 r4 S4 r3 S5 – Esmond Apr 10 '10 at 09:13
  • Er, I just constructed the tree and the output is properly produced: http://pastie.org/912639 – Petar Minchev Apr 10 '10 at 09:17
0

The fixed code is working only for trees of 0 or 2 children per node.

This should work for arbitrary binary trees + adds branch distance parameter:

BinaryTree left;
BinaryTree right;
double ldist;
double rdist;
String label;

//...    

public String toNewick(){

        if(right==null && left==null){
          return label.toString();
        }

        String output = "(";
        if(right!=null){
          output+=right.toNewick()+":"+rdist;
        }
        if(right!=null && left!=null){
          output+=",";
        }
        if(left!=null){
          output+=left.toNewick()+":"+ldist;
        }
        output += ")";    

        return output;
}
Tomas F.
  • 610
  • 1
  • 6
  • 13