46

I'm working on putting together a problem set for an intro-level CS course and came up with a question that, on the surface, seems very simple:

You are given a list of people with the names of their parents, their birth dates, and their death dates. You are interested in finding out who, at some point in their lifetime, was a parent, a grandparent, a great-grandparent, etc. Devise an algorithm to label each person with this information as an integer (0 means the person never had a child, 1 means that the person was a parent, 2 means that the person was a grandparent, etc.)

For simplicity, you can assume that the family graph is a DAG whose undirected version is a tree.

The interesting challenge here is that you can't just look at the shape of the tree to determine this information. For example, I have 8 great-great-grandparents, but since none of them were alive when I was born, in their lifetimes none of them were great-great-grandparents.

The best algorithm I can come up with for this problem runs in time O(n^2), where n is the number of people. The idea is simple: start a DFS from each person, finding the furthest descendant down in the family tree who was born before that person's death date. However, I'm pretty sure that this is not the optimal solution to the problem. For example, if the graph is just two parents and their n children, then the problem can be solved trivially in O(n). What I'm hoping for is an algorithm that either beats O(n^2) or whose runtime is parameterized by the shape of the graph, making it fast for wide graphs while degrading gracefully to O(n^2) in the worst case.
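For concreteness, here is a rough Python sketch of that quadratic baseline (the `Person` layout and the child links are my own assumptions; the input gives parents, so child links would be built in a linear preprocessing pass):

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    birth: int
    death: int
    children: list = field(default_factory=list)

def label(person):
    """DFS for the deepest descendant level born before this person's death."""
    def dfs(p):
        best = 0
        for c in p.children:
            # only recurse into descendants born during person's lifetime;
            # anyone below a too-late child is born even later, so pruning is safe
            if c.birth <= person.death:
                best = max(best, 1 + dfs(c))
        return best
    return dfs(person)

# Hypothetical three-generation chain: gp dies before the grandchild arrives.
gp = Person("gp", 1900, 1950)
parent = Person("p", 1930, 2000)
child = Person("c", 1960, 2040)
gp.children.append(parent)
parent.children.append(child)
```

Running `label` over everyone is one DFS per person, hence the O(n^2) bound; here `gp` gets 1 (a parent but never a grandparent in his lifetime) while `parent` gets 1 and `child` gets 0.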

GEOCHET
templatetypedef
  • 3
    I'm not sure how it could be a strict tree, since there may be multiple people at each level (at the child level, brothers/sisters, at the parent level, multiple parents, maternal/paternal parents, etc). If you are simplifying the problem to ignore this, you need to reveal whether the oldest or youngest is the single root. Either that or don't pretend it is a tree and instead is just a graph with no privileged root. – Seth Robertson May 23 '11 at 19:50
  • 1
    Sorry, let me clarify - I'm using the term "family tree" here not in a technical sense, but as it's colloquially used in English. The graph is more properly a DAG. However, I still think that the undirected version of the graph is a tree in that it has no cycles. Am I incorrect about this? – templatetypedef May 23 '11 at 20:04
  • I'm not so sure that your algorithm is O(n^2) as you mention, since it seems that the population of people are not all related to each other - i.e. rather than one giant family tree, you'll have many distinct family trees - and that the furthest you will need to search from any given person in the tree through their descendants is only a handful of generations (certainly < 10), considering the typical lifespan of a human and the age at which people have children. – matt b May 23 '11 at 20:06
  • No, but when talking about a family tree, in many cases there can be a privileged node forming a traditional tree (even in the DAG case). When using tree in multiple senses, if you are talking about a theoretical tree you probably should clarify. – Seth Robertson May 23 '11 at 20:09
  • @matt b- I agree with everything you say, but mathematically the worst case is still O(n^2). In practice it's almost certainly faster. – templatetypedef May 23 '11 at 20:09
  • @templatetypedef - the undirected version will have many cycles. Just consider a couple that have two children, Alice and Bob. You can follow this path: Alice => Father => Bob => Mother => Alice. But if you enforce the direction, then there cannot be cycles. – Aaron McDaid May 23 '11 at 21:49
  • Isn't this simply a topological sorting of a DAG while keeping track of the maximum levels of descendants you had in your lifetime? – biziclop May 23 '11 at 23:11
  • @biziclop- Not quite, since even if you are three levels above some other node, you might not be a great-grandparent if you had died before that other person was born. – templatetypedef May 24 '11 at 00:00
  • @templatetypedef: Your solution is similar to what I and @Aaron McDaid came up with. But ours is somewhat more efficient if the population is rapidly expanding because your DFS has to search all descendants until your death, while in ours the upwards updates stop when there isn't an improvement. So if each generation has 6 kids, that each have 6 kids, then in yours the grandparents have to search all 36 grandkids while in ours only 6 of the grandkids try to update the grandparents. In practice this is likely to only be a small constant factor improvement. – btilly May 24 '11 at 00:21
  • Currently, all three answers have downvotes! But I'm pretty sure @btilly's is right. It's a pity that such an interesting question on StackOverflow doesn't have a well-upvoted answer. – Aaron McDaid May 24 '11 at 13:57
  • @templatetypedef Nice question! In the general case I doubt that there is any solution below O(n^2), because the number of possible parent-child relationships is O(n^2), and it seems to me that any algorithm would have to look at each relationship at least once. – mitchus May 24 '11 at 19:57
  • @mitchus- That's a good point, but perhaps there's a solution that is worst-case O(n^2), but could be a lot better based on the shape of the graph. For example, perhaps we could get something like O(n^2 / b), where b is the average branching factor in the tree, which would be as low as O(n). Even a solution like that would be great. – templatetypedef May 24 '11 at 22:16
  • 2
    @templatetypedef: Two of us have independently given you a solution that is `O(n b k)` where `n` is the number of people, `b` is the average branching factor of the tree, and `k` is the maximum number of generations alive at once. I doubt that it is possible to get much more efficient than having everyone updated once per child, per generation. – btilly May 24 '11 at 23:42
  • @templatetypedef, perhaps you should rewrite the question a little. First, do you want to specify that the relationship graph is already fully represented in memory before this algorithm starts? It has been pointed out, by @Alexey (and others?), that if we only have strings to identify the people at first, then some sort of hash lookup is required to link each person to the data of its parents. Also, the claim that the undirected graph is a tree is problematic I think - it might cause confusion. A rewrite might help us to focus. – Aaron McDaid May 31 '11 at 17:04
  • And another question about the question: What about the first people that are born, in particular the very first person, who are their parents? Do we just ignore that, and allow that each person has 0,1, or 2 parents? – Aaron McDaid May 31 '11 at 17:09

9 Answers

11

Update: This is not the best solution I have come up with, but I've left it because there are so many comments relating to it.

You have a set of events (birth/death), parental state (no descendants, parent, grandparent, etc) and life state (alive, dead).

I would store my data in structures with the following fields:

mother
father
generations
is_alive
may_have_living_ancestor

Sort your events by date, and then for each event take one of the following two courses of logic:

Birth:
    Create a new person with a mother, father, 0 generations, who is alive and
        may have a living ancestor.
    For each parent:
        Raise that parent's generations to at least 1.  If generations
            increased, then recursively increase generations for all ancestors
            whose generations increased.  While doing that, set the
            may_have_living_ancestor flag to false for anyone who is discovered
            to have no living ancestors.  (You only iterate into a person's
            ancestors if you increased their generations, and if they still
            could have living ancestors.)

Death:
    Emit the person's name and generations.
    Set their is_alive flag to false.

The worst case is O(n*n) if everyone has a lot of living ancestors. However in general you've got the sorting preprocessing step which is O(n log(n)) and then you're O(n * avg no of living ancestors) which means that the total time tends to be O(n log(n)) in most populations. (I hadn't counted the sorting prestep properly, thanks to @Alexey Kukanov for the correction.)
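As a rough illustration (my own Python sketch, not code from this answer), the sweep can be written as follows; the `may_have_living_ancestor` pruning is omitted for brevity, and note that dead ancestors still receive updates so the chain can reach a still-living great-grandparent:

```python
def label_generations(people):
    """people: dict name -> {'birth': int, 'death': int, 'parents': [names]}.
    Returns dict name -> label as of each person's death (0 = never a parent,
    1 = parent, 2 = grandparent, ...)."""
    gen = {name: 0 for name in people}
    result = {}

    events = []
    for name, p in people.items():
        events.append((p['birth'], 0, name))  # births sort before deaths on ties
        events.append((p['death'], 1, name))
    events.sort()

    def bump(name, level):
        # Recursively raise ancestors' counts, but only while the count
        # actually improves; dead ancestors are updated too (their answer is
        # already frozen) so the update can propagate past them.
        for parent in people[name]['parents']:
            if gen[parent] < level:
                gen[parent] = level
                bump(parent, level + 1)

    for _, kind, name in events:
        if kind == 0:      # birth: propagate a new generation upward
            bump(name, 1)
        else:              # death: the label is frozen at this moment
            result[name] = gen[name]
    return result

# Hypothetical four-generation chain; the grandfather dies before 'me' is born.
people = {
    'me':  {'birth': 1980, 'death': 2060, 'parents': ['f']},
    'f':   {'birth': 1950, 'death': 2070, 'parents': ['gf']},
    'gf':  {'birth': 1920, 'death': 1975, 'parents': ['ggf']},
    'ggf': {'birth': 1890, 'death': 1985, 'parents': []},
}
```

Here `gf` is labeled 1 (his grandchild was born after his death), while `ggf` is labeled 3, matching the corner case discussed in the comments below.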

btilly
  • Hi @btilly, I suggest you might edit your answer for clarity. For example I think it helps to clarify how you will go through all the births and deaths, and also to clarify that the `is_alive` and `may_have_living_ancestor` fields are initialized to zero at the "start of time". That might help get some votes, or at least get this answer back to zero! – Aaron McDaid May 24 '11 at 13:59
  • @Aaron McDaid: You've hit on a minor difference in our solutions. You initialize all records at the start of time. I create records for people upon encountering their birth. I'm puzzled at why all answers got downvoted, but these things happen. – btilly May 24 '11 at 14:47
  • @btilly, Yes. And I've noticed another difference. You spotted another optimization also. If I am born, and my grandparent has already been marked as 'being a living grandparent', then there is no need to check back any further. I think this improves the complexity analysis also; now the worst case O(n*n) only happens if you have lots of living ancestors _and_ lots of people who have no brothers and sisters. – Aaron McDaid May 25 '11 at 10:36
  • 1
    I don't think this will work because of odd cases. For instance my grandfather might have died before I was born, but my great-grandfather would still be living and would still be counted even though my grandfather's generation count wouldn't have increased. – Jason Goemaat May 25 '11 at 20:43
  • @Jason Goemaat: That corner case is already accounted for. Your grandfather is correctly emitted as of the date of your grandfather's death. That will not change later when your grandfather's count increases later and correctly increases your great grandfather's count. To the best of my knowledge, there are no corner cases where this algorithm is wrong. – btilly May 25 '11 at 21:41
  • @Aaron McDaid: Ah yes, that is an important optimization. I assumed you had noticed, or if you had not that you would while coding. – btilly May 25 '11 at 21:43
  • @Jason, that is the purpose of the `may_have_a_living_ancestor` field. So your father and grandfather may be dead, but this field will be set for them until the greatgrandfather (and other older relatives) have died. The `may_have_a_living_ancestor` field is not switched off until we're certain that all ancestors are dead. – Aaron McDaid May 25 '11 at 22:22
  • 1
    However, "sort your events by date" means it's at least `Theta(n log n)` complex, is not it? – Alexey Kukanov May 26 '11 at 03:09
  • I think O(n*n) will only happen if everyone has exactly *one* descendant and if people live a long time. To explain that, first consider that for each parent only their first-born child is relevant. A second-born child cannot add more information to the ancestors than the first-born child. @btilly's algorithm accounts for this - when the second born child is processed, the parent will already be marked as being a parent, and the DFS will halt immediately. So I think the worst case is where everyone has many living ancestors _and_ few descendants (equivalently, few siblings). – Aaron McDaid May 28 '11 at 17:18
8

I thought of this this morning, then found that @Alexey Kukanov had similar thoughts. But mine is more fleshed out and has some more optimization, so I'll post it anyways.

This algorithm is O(n * (1 + generations)), and will work for any dataset. For realistic data this is O(n).

  1. Run through all records and generate objects representing people, which include date of birth, links to parents, links to children, and several more uninitialized fields (the time of last death among self and ancestors, and an array of the dates at which they first had 0, 1, 2, ... surviving generations).
  2. Go through all people and recursively find and store the time of last death. If a person is visited again, return the memoized value. Each person is encountered once in the scan (requiring the calculation), and the first calculation generates at most 2 more calls, one to each parent. This gives a total of O(n) work to initialize this data.
  3. Go through all people and recursively generate a record of when they first added a generation. These records only need to go up to the later of when the person or their last ancestor died. It is O(1) to calculate when they had 0 generations. Then for each recursive call to a child you need to do O(generations) work to merge that child's data into yours. Each person is visited when encountered in the data structure, and can be called once from each parent, for O(n) calls and total expense O(n * (generations + 1)).
  4. Go through all people and figure out how many generations they had at their death. This is again O(n * (generations + 1)) if implemented with a linear scan.

The sum total of all of these operations is O(n * (generations + 1)).

For realistic data sets, this will be O(n) with a fairly small constant.
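A minimal Python sketch of step 2, the memoized last-death computation (the record layout is my own; this shows only the initialization pass, not the generation-merging in step 3):

```python
def last_death(people):
    """For each person, the latest death date among the person and all of
    their ancestors, memoized so the total work is O(n)."""
    memo = {}
    def go(name):
        if name in memo:
            return memo[name]
        d = people[name]['death']
        for parent in people[name]['parents']:
            d = max(d, go(parent))  # at most 2 extra calls per first visit
        memo[name] = d
        return d
    return {name: go(name) for name in people}

# Hypothetical four-person chain (integer years for simplicity).
family = {
    'me':  {'birth': 1980, 'death': 2060, 'parents': ['f']},
    'f':   {'birth': 1950, 'death': 2070, 'parents': ['gf']},
    'gf':  {'birth': 1920, 'death': 1975, 'parents': ['ggf']},
    'ggf': {'birth': 1890, 'death': 1985, 'parents': []},
}
```

This is the value step 3 uses to cut off each person's record: no generation data past `last_death` can ever matter for that person or their ancestors.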

btilly
5

My suggestion:

  • in addition to the values described in the problem statement, each personal record will have two fields: a child counter and a dynamically growing vector (in the C++/STL sense) which will keep the earliest birthday in each generation of a person's descendants.
  • use a hash table to store the data, with the person name being the key. The time to build it is linear (assuming a good hash function, the map has amortized constant time for inserts and finds).
  • for each person, detect and save the number of children. It's also done in linear time: for each personal record, find the record for its parents and increment their counters. This step can be combined with the previous one: if a record for a parent is not found, it is created and added, while details (dates etc) will be added when found in the input.
  • traverse the map, and put references to all personal records with no children into a queue. Still O(N).
  • for each element taken out of the queue:
    • add the birthday of this person into descendant_birthday[0] for both parents (grow that vector if necessary). If this field is already set, change it only if the new date is earlier.
    • For all descendant_birthday[i] dates available in the vector of the current record, follow the same rule as above to update descendant_birthday[i+1] in parents' records.
    • decrement parents' child counters; if it reaches 0, add the corresponding parent's record into the queue.
    • the cost of this step is O(C*N), with C being the biggest value of "family depth" for the given input (i.e. the size of the longest descendant_birthday vector). For realistic data it can be capped by some reasonable constant without correctness loss (as others already pointed out), and so does not depend on N.
  • traverse the map one more time, and "label each person" with the biggest i for which descendant_birthday[i] is still earlier than the death date; also O(C*N).

Thus for realistic data the solution for the problem can be found in linear time. Though for contrived data like that suggested in @btilly's comment, C can be big, even of the order of N in degenerate cases. It can be resolved either by putting a cap on the vector size or by extending the algorithm with step 2 of @btilly's solution.

A hash table is a key part of the solution if parent-child relations in the input data are provided through names (as written in the problem statement). Without hashes, it would require O(N log N) to build the relation graph. Most other suggested solutions seem to assume that the relationship graph already exists.
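A hedged Python sketch of the queue-based propagation above (field names and the integer dates are my own; dict lookups stand in for the hash table):

```python
from collections import deque

def label_people(family):
    """family: dict name -> {'birth': int, 'death': int, 'parents': [names]}.
    descendant_birthday[name][i] is the earliest birth among descendants
    that are i+1 generations down from name."""
    kids = {name: 0 for name in family}
    descendant_birthday = {name: [] for name in family}
    for p in family.values():
        for parent in p['parents']:
            kids[parent] += 1

    # start from the people with no children, as described above
    queue = deque(name for name in family if kids[name] == 0)
    while queue:
        name = queue.popleft()
        dates = [family[name]['birth']] + descendant_birthday[name]
        for parent in family[name]['parents']:
            v = descendant_birthday[parent]
            # own birthday updates parent's slot 0; own slot i updates slot i+1
            for i, date in enumerate(dates):
                if i == len(v):
                    v.append(date)
                else:
                    v[i] = min(v[i], date)
            kids[parent] -= 1
            if kids[parent] == 0:
                queue.append(parent)

    labels = {}
    for name, p in family.items():
        lab = 0
        for i, date in enumerate(descendant_birthday[name]):
            if date <= p['death']:
                lab = i + 1
        labels[name] = lab
    return labels

family = {
    'me':  {'birth': 1980, 'death': 2060, 'parents': ['f']},
    'f':   {'birth': 1950, 'death': 2070, 'parents': ['gf']},
    'gf':  {'birth': 1920, 'death': 1975, 'parents': ['ggf']},
    'ggf': {'birth': 1890, 'death': 1985, 'parents': []},
}
```

On this example the labels come out as 0 for `me`, 1 for `f` and `gf` (the grandchild was born after `gf` died), and 3 for `ggf`.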

Alexey Kukanov
  • I think a little more care on generations is needed. But this is mostly correct. – btilly May 26 '11 at 16:40
  • I think generations are processed correctly, no additional care is needed. James Crook has described the same solution in his answer; his description is more theoretical while mine gives some implementation details. – Alexey Kukanov May 28 '11 at 19:26
  • @Alexey Kukanov: You assume that you only need to go back a fixed number of generations. If you set that number too high, you do too much work. If you set it too low, you get incorrect answers. See my second presented solution for a variation that effectively dynamically figures out that right value for `C` on the fly, making it both fast and correct. – btilly May 28 '11 at 20:55
  • @btilly: Actually, I did not assume a fixed number of generations, other than for cost analysis where I took C as the biggest dynamic number of generations for a given input. It might not be clear from the description; I will see if I can fix it. – Alexey Kukanov May 28 '11 at 21:52
  • @Alexey Kukanov: Ah. But then consider the case of following a population of 1000 individuals through 10,000 generations. Your total population is a million, but you'll have billions of data points. The vast majority of it useless information about when a long-dead person could have become a great-(times 3000)-grandparent. – btilly May 29 '11 at 04:16
  • @btilly: I agree. Your step 2 finds the right point where to stop. – Alexey Kukanov May 29 '11 at 06:11
3

Create a list of people, sorted by birth_date. Create another list of people, sorted by death_date. You can travel logically through time, popping people from these lists, in order to get a list of the events as they happened.

For each Person, define an is_alive field. This'll be FALSE for everyone at first. As people are born and die, update this record accordingly.

Define another field for each person, called has_a_living_ancestor, initialized to FALSE for everyone at first. At birth, x.has_a_living_ancestor will be set to x.mother.is_alive || x.mother.has_a_living_ancestor || x.father.is_alive || x.father.has_a_living_ancestor. So, for most people (but not everyone), this will be set to TRUE at birth.

The challenge is to identify occasions when has_a_living_ancestor can be set to FALSE. Each time a person is born, we do a DFS up through the ancestors, but only those ancestors for which ancestor.has_a_living_ancestor || ancestor.is_alive is true.

During that DFS, if we find an ancestor that has no living ancestors, and is now dead, then we can set has_a_living_ancestor to FALSE. This does mean, I think, that sometimes has_a_living_ancestor will be out of date, but it will hopefully be caught quickly.

Aaron McDaid
  • I think I'm happy with my algorithm now. But now I've looked properly at @btilly's algorithm and realised that it is probably the same as mine! Sorry btilly! – Aaron McDaid May 23 '11 at 23:08
  • Yes, we had exactly the same idea. – btilly May 24 '11 at 00:08
  • I would be quite surprised if your mother was not alive for your birth... [/nitpick] – Matthieu M. May 25 '11 at 20:01
  • @Matthieu M. Ever heard of the mother dying in child birth? The mother could have officially died minutes before the docs performed a Caesarian to save the baby. But if you were pointing out a serious flaw in the algorithm, I am not by any means correcting that :) I'm just in the mood for silly comments today. – abcd May 25 '11 at 20:58
  • @yoda: Ah true! You're right! And no, I've got no such thing as an intelligent remark on this algorithm. It works as far as I am concerned and I'm just trying to design a "faster" one :) – Matthieu M. May 26 '11 at 06:14
3

The following is an O(n log n) algorithm that works for graphs in which each child has at most one parent (EDIT: this algorithm does not extend to the two-parent case with O(n log n) performance). It is worth noting that I believe the performance can be improved to O(n log(max level label)) with extra work.

One parent case:

For each node x, in reverse topological order, create a binary search tree T_x that is strictly increasing both in date of birth and in number of generations removed from x. (T_x contains the first born child c1 in the subgraph of the ancestry graph rooted at x, along with the next earliest born child c2 in this subgraph such that c2's 'great grandparent level' is strictly greater than that of c1, along with the next earliest born child c3 in this subgraph such that c3's level is strictly greater than that of c2, etc.) To create T_x, we merge the previously-constructed trees T_w where w is a child of x (they are previously-constructed because we are iterating in reverse topological order).

If we are careful with how we perform the merges, we can show that the total cost of such merges is O(n log n) for the entire ancestry graph. The key idea is to note that after each merge, at most one node of each level survives in the merged tree. We associate with each tree T_w a potential of h(w) log n, where h(w) is equal to the length of the longest path from w to a leaf.

When we merge the child trees T_w to create T_x, we 'destroy' all of the trees T_w, releasing all of the potential that they store for use in building the tree T_x; and we create a new tree T_x with (log n)(h(x)) potential. Thus, our goal is to spend at most O((log n)(sum_w(h(w)) - h(x) + constant)) time to create T_x from the trees T_w so that the amortized cost of the merge will be only O(log n). This can be achieved by choosing the tree T_w such that h(w) is maximal as a starting point for T_x and then modifying T_w to create T_x. After such a choice is made for T_x, we merge each of the other trees, one by one, into T_x with an algorithm that is similar to the standard algorithm for merging two binary search trees.

Essentially, the merging is accomplished by iterating over each node y in T_w, searching for y's predecessor z by birth date, and then inserting y into T_x if it is more levels removed from x than z; then, if y was inserted into T_x, we search for the node in T_x of the lowest level that is strictly greater than y's level, and splice out the intervening nodes to maintain the invariant that T_x is ordered strictly both by birth date and level. This costs O(log n) for each node in T_w, and there are at most O(h(w)) nodes in T_w, so the total cost of merging all trees is O((log n)(sum_w h(w))), summing over all children w except for the child w' such that h(w') is maximal.

We store the level associated with each element of T_x in an auxiliary field of each node in the tree. We need this value so that we can figure out the actual level of x once we've constructed T_x. (As a technical detail, we actually store the difference of each node's level with that of its parent in T_x so that we can quickly increment the values for all nodes in the tree. This is a standard BST trick.)

That's it. We simply note that the initial potential is 0 and the final potential is positive so the sum of the amortized bounds is an upper bound on the total cost of all merges across the entire tree. We find the label of each node x once we create the BST T_x by binary searching for the latest element in T_x that was born before x died at cost O(log n).

To improve the bound to O(n log(max level label)), you can lazily merge the trees, only merging the first few elements of the tree as necessary to provide the solution for the current node. If you use a BST that exploits locality of reference, such as a splay tree, then you can achieve the above bound.

Hopefully, the above algorithm and analysis are at least clear enough to follow. Just comment if you need any clarification.

jonderry
  • *For each node x, in reverse topological order, create a binary search tree T_x that is strictly increasing both in date of birth and in number of generations removed from x.* How do you do this if my older brother Larry had his first children before I was born? The only way that I see is to not include me in the tree, at which point you won't be able to find out when I became a parent. – btilly May 31 '11 at 14:48
  • T_x only includes descendants of x, so it would not include any of x's brother's children (barring incest as specified in the OP, and even then, only x's children with x's sibling would be counted in T_x if we were to use a similar algorithm when incest was permitted). – jonderry May 31 '11 at 16:51
  • @jonderry: So you'd figure out me, then my parent. And will have trouble in the event of related people. If so then I believe your answer, but note that if you follow a population over time that "incest" will happen. (And is not even that bad - according to data from Iceland maximum fertility tends to happen between second cousins.) – btilly May 31 '11 at 17:29
  • Haha, ok I'll take your word for it on Iceland. Yes, the running time is not guaranteed O(n log n) in the event of some incest. But if there's not a lot of incest between close relatives, performance would still be good if pruning was done so that far off descendants could be forgotten about, if we were working with real data. – jonderry May 31 '11 at 17:45
  • You shouldn't take my word on it. According to http://www.scientificamerican.com/article.cfm?id=when-incest-is-best-kissi it is third or fourth cousins, not second. Also if you're looking at real data from, for example, breeding records for purebred cats, you'll find a *lot* of brother-sister matings as they try to get certain traits to breed true. – btilly May 31 '11 at 17:57
  • 1
    BTW, I thought about it some more and I think the extension to the two-parent case doesn't work. We can only get O(n log n) in the one parent case. – jonderry May 31 '11 at 19:19
2

I have a hunch that obtaining for each person a mapping (generation -> date the first descendant in that generation is born) would help.

Since the dates must be strictly increasing, we would be able to use binary search (or a neat data structure) to find the most distant living descendant in O(log n) time.

The problem is that merging these lists (at least naively) is O(number of generations) so this could get to be O(n^2) in the worst case (consider A and B are parents of C and D, who are parents of E and F...).

I still have to work out how the best case works and try to identify the worst cases better (and see if there is a workaround for them).

hugomg
  • +1 I think the first sentence here is a very good idea. I don't yet understand the rest of your answer though :-) I think the first sentence can be applied in a simple, fast, algorithm. (I think @btilly's answer is at least as efficient, maybe a little more so because it requires less data per person to be stored.) – Aaron McDaid May 25 '11 at 22:33
2

We recently implemented a relationship module in one of our projects, where everything was stored in a database, and I think the algorithm was roughly 2n·O(m) (m is the max branching factor). I multiplied the operations by two because in the first round we create the relationship graph, and in the second round we visit every Person. We store a bidirectional relationship between every two nodes; while navigating, we only travel in one direction, but we have two sets of operations: one traverses only children, the other traverses only parents.

Person{
  String Name;

  // all relations where
  // this is FromPerson
  Relation[] FromRelations; 

  // all relations where
  // this is ToPerson
  Relation[] ToRelations;

  DateTime birthDate;
  DateTime? deathDate;
}

Relation
{
  Person FromPerson;
  Person ToPerson;
  RelationType Type;
}

enum RelationType
{
  Father,
  Son,
  Daughter,
  Mother
}

This kind of looks like a bidirectional graph. But in this case, first you build the list of all Person objects, then you build the list of relations and set up FromRelations and ToRelations on each node. Then all you have to do is, for every Person, navigate only the ToRelations of type (Son, Daughter). And since you have the dates, you can calculate everything.

I don't have time to check the correctness of the code, but it should give you an idea of how to do it.

void LabelPerson(Person p){
   int n = GetLevelOfChildren(p, p.birthDate, p.deathDate);
   // label based on n...
}

int GetLevelOfChildren(Person p, DateTime bd, DateTime? ed){
   List<int> depths = new List<int>();
   foreach(Relation r in p.ToRelations.Where(
             x=>x.Type == Son || x.Type == Daughter))
   {
      Person child = r.ToPerson;
      // only recurse into children born before the death date;
      // a null death date means the person is still alive
      if(ed == null || child.birthDate <= ed.Value){
         depths.Add( 1 + GetLevelOfChildren( child, bd, ed));
      }
   }
   if(depths.Count==0)
      return 0;
   return depths.Max();
}
Akash Kava
0

Here's my stab:

class Person
{
    Person [] Parents;
    string Name;
    DateTime DOB;
    DateTime DOD;
    int Generations = 0;

    void Increase(DateTime dob, int generations)
    {
        // current person is alive when caller was born
        if (dob < DOD)
            Generations = Math.Max(Generations, generations);
        // recurse regardless: an older ancestor may still outlive this person
        foreach (Person p in Parents)
            p.Increase(dob, generations + 1);
    }

    void Calculate()
    {
        foreach (Person p in Parents)
            p.Increase(DOB, 1);
    }
}

// run for everyone
Person [] people = InitializeList(); // create objects from information
foreach (Person p in people)
    p.Calculate();
Jason Goemaat
-2
  • There's a relatively straightforward O(n log n) algorithm that sweeps the events chronologically with the help of a suitable top tree.

  • You really shouldn't assign homework that you can't solve yourself.

madman
  • 2
    Can you elaborate on how to do this using a top tree? Also, I didn't give this question out as an assignment precisely because I didn't know the optimal solution... without giving students fair warning, I'd never give something out that I couldn't solve. :-) – templatetypedef May 23 '11 at 20:06
  • @templatetypedef Top trees essentially generalize Sleator-Tarjan trees. For this problem, the relevant feature is that you can make topology updates and evaluate fixed tree-recursive functions in time O(log n). – madman May 23 '11 at 20:17
  • I understand what a top tree is, but I still don't see how to use them to solve this problem. Can you give an example of how you would use them to speed up one of the tree-recursive functions that comes up in this problem? I'm not sure I see where this would be useful. – templatetypedef May 23 '11 at 20:19
  • I don't think I see how the total weight of a subtree is defined or relevant for this problem. I think that you may be on to something with a dynamic tree structure, but without a more detailed description I'm not sure whether or not this approach will pan out. – templatetypedef May 23 '11 at 21:08