I am new to Tree structures. I am trying to design an algorithm which takes as input a predetermined tree. It should then seed the root of the tree with an object of my own design (I don't think it's relevant to the design of the algorithm, but the object is an ArrayList), and uses a "change operator" to modify the object along each branch of the tree. The number of passes of the change operator is dependent upon the branch length. The object must be evaluated at each node of the tree, and this node value is then used to evaluate the values of its child-nodes. Therefore, I cannot, simply evaluate the value at each tip using the total length from root to tip, as the change operator I am using is stochastic, rather than deterministic. Therefore the value of each child node is dependent upon the value of its parent node.
I have attempted to design this myself. I started by creating an array of times at which a single branch would split into its children.
int[] times = new int[branchnumber];
times[0] = 1;
times[1] = 5;
Then, I created a method which took the times at which branching should occur (which I interpreted as a branch length), the ArrayList object, the current time, and the total number of branches.
public static void Brancher(int[] times, List<double[][]> sequences, int t){
boolean checker = false;
for (int i = 0; i < times.length; i++) {
if (times[i] == t) {
checker = true;
}
}
if (checker == true) {
double[][] seq = sequences.get(sequences.size() - 1);
sequences.add(seq);
}
}
The Brancher method is implemented as below:
for (int j = 0; j <= loopnumber; j++) {
MathsOperators.Brancher(times, sequences, t);
for (int i = 0; i < sequences.size(); i++) {
double[][] sequenceholder = sequences.get(i);
MathsOperators.PrintOutput(sequenceholder, t);
ComplexInput.evolvesequence(sequenceholder, frequencies, transitionrate, transversionrate, random);
sequences.set(i, sequenceholder);
}
t = t + dt;
}
So, I hold the current state of all objects in the tree as Arrays held in the sequences ArrayList. These objects are then acted upon by the evolve method, and the updated objects replace the originals in the ArrayList. When a branch is to be added, the Brancher method takes the last Array object in the ArrayList, copies it, and adds the copy to the list. This has in effect simulated the splitting of the last object into two objects. The copy is then updated and evolved in the next iteration of the simulation loop.
This method of doing things, while not pretty, results in trees that look like this: http://content.science20.com/files/Tree3A.jpg . This is a relatively uncomplicated structure.
However, it is impossible to create a tree of this shape: http://content.science20.com/files/Tree4.jpg . This tree has multiple splits to the far right. I don't know the exact terminology to describe the differences between the trees here, but its reasonably obvious.
I think I am confusing myself here. Any advice about how to think about this problem would be greatly appreciated.
(If it helps with context, the input object is a genetic sequence (ACGT). The idea is to evolve the sequence along each branch of the tree, with each tip representing the descendant of the original ancestor (input) sequence)
EDIT
The input object is an ArrayList, containing multiple (n x 2) Arrays of doubles. The first column of each array contains n integers from the set {1,2,3,4}, with frequencies I control. However, the order of these characters in the column is random. Each integer represents one of the characters A,C,G,T in a DNA sequence. The second column contains a double representing the rate of evolution for each DNA site, so one rate entry for each integer entry. The initial Array is generated using a function I call DNASEQ, it's very long and doesn't make a difference to this question.
To start, I generate one of these Arrays, and add it to an ArrayList called sequences. Using the simulation loop shown above, I then apply the evolvesequence method to each Array in the ArrayList.
When Brancher detects that the time, which is updated after every loop in increments of 1, is equal to one of the times of branching specified, then it takes the last Array in the ArrayList and adds a copy of the array to the Arraylist. If the simulation stopped there, the Array would be identical to the one it was copied from. However, now that it is in the ArrayList, it is subject to the evolvesequence method. Since evolvesequence is a stochastic/probabilistic method, then for a given input, no two outputs are guaranteed to be identical. So, after a few iterations of the simulation loop, the copied Array will be very different from the original.