5

Recently, I'm trying to solve all the exercises in CLRS. but there are some of them i can't figure out. Here is one of them, from CLRS exercise 12.4-2:

Describe a binary search tree on n nodes such that the average depth of a node in the tree is Θ(lg n) but the height of the tree is ω(lg n). Give an asymptotic upper bound on the height of an n-node binary search tree in which the average depth of a node is Θ(lg n).

Can anyone share some ideas or references to solve this problem? Thanks.

Saeed Amiri
  • 22,252
  • 5
  • 45
  • 83
meteorgan
  • 246
  • 3
  • 9

3 Answers3

6

So let's suppose that we build the tree this way: given n nodes, take f(n) nodes and set them aside. Then build a tree by building a perfect binary tree where the root has a left subtree that's a perfect binary tree of n - f(n) - 1 nodes and a right subtree that's a chain of length f(n). We'll pick f(n) later.

So what's the average depth in the tree? Since we just want an asymptotic bound, let's pick n such that n - f(n) - 1 is one less than a perfect power of two, say, 2^k - 1. In that case, the sum of the heights in this part of the tree is 1*2 + 2*3 + 4*4 + 8*5 + ... + 2^(k-1) * k, which is (IIRC) about k 2^k, which is just about (n - f(n)) log (n - f(n)) by our choice of k. In the other part of the tree, the total depth is about f(n)^2. This means that the average path length is about ((n - f(n))log (n - f(n)) + f(n)^2) / n. Also, the height of the tree is f(n). So we want to maximize f(n) while keeping the average depth O(log n).

To do this, we need to find f(n) such that

  1. n - f(n) = Θ(n), or the log term in the numerator disappears and the height isn't logarithmic,
  2. f(n)^2 / n = O(log n), or the second term in the numerator gets too big.

If you pick f(n) = Θ(sqrt(n log n)), I think that 1 and 2 are satisfied maximally. So I'd wager (though I could be totally wrong about this) that this is as good as you can get. You get a tree of height Θ(sqrt(n log n)) that has average depth Θ(Log n).

Hope this helps! If my math is way off, please let me know. It's late now and I haven't done my usual double-checking. :-)

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • This is close but I think you want a perfect tree with a tail off one of the leaf nodes, rather than having a perfect left sub-tree with the whole right tree being a chain. Basically you want to pack as many nodes near the top of the tree as possible, then have one long chain coming off the tree. – Rusty Rob Feb 13 '12 at 09:56
  • @robertking- Hmm, that's a good point. Reworking the math doesn't make it seem like this does much asymptotically, because only O(log n) of the nodes from the chain would be discounted. But I think you're right. – templatetypedef Feb 13 '12 at 20:42
  • 1
    "In that case, the sum of the heights in this part of the tree is 1*2 + 2*3 + 4*4 + 8*5 + ... + 2^(k-1) * k". Why is the height of the root node 2? Do you mean 'sum of the depths'? If you mean depths then why is the depth of the root node of the left subtree 2? It should either be 1 if it refers to the depth in the main tree or 0 if it refers to the depth in left subtree. Depth goes 0, 1, 2, ... not 1, 2, 3, ... – Kakaji Sep 03 '17 at 03:42
0

If you are trying to maximize the height of a tree while minimizing the average depth of all the nodes of the tree, the unambiguous best shape would be an "umbrella" shape, e.g. a full binary tree with k nodes and height = lg k, where 0 < k < n, along with a single path, or "tail", of n-k nodes coming out of one of the leaves of the full part. The height of this tree is roughly lg k + n - k.

Now let's compute the total depth of all the nodes. The sum of the depths of the nodes of the full part is sum[ j * 2^j ], where the sum is taken from j=0 to j=lg k. By some algebra, the dominant term of the result is 2k lg k.

Next, the sum of the depths of the tail part is given by sum[i + lg k], where the sum is taken from i=0 to i=n-k. By some algebra, the result is approximately (n-k)lg k + (1/2)(n-k)^2.

Hence, summing the two parts above together and dividing by n, the average depth of all the nodes is (1 + k/n) lg k + (n-k)^2 / (2n). Note that because 0 < k < n, the first term here is O(lg n) no matter what k we choose. Hence, we need only make sure the second term is O(lg n). To do so, we require that (n-k)^2 = O(n lg n), or k = n - O(sqrt(n lg n)). With this choice, the height of the tree is

lg k + n - k = O( sqrt(n lg n) )

this is asymptotically larger than the ordinary O(lg n), and is asymptotically the tallest you can make the tree while keeping the average depth to be O(lg n)

xdavidliu
  • 2,411
  • 14
  • 33
0

first maximize the height of the tree. (have a tree where each node only has one child node, so you have a long chain going downward).

Check the average depth. (obviously the average depth will be too high).

while the average depth is too high, you must decrease the height of the tree by one. There are many ways to decrease the height of the tree by one. Choose the way which minimizes the average height. (prove by induction that each time you should select the one that minimizes the average height). Keep going until you fall under the average height requirement. (e.g. calculate using induction a formula for the height and the average depth).

Rusty Rob
  • 16,489
  • 8
  • 100
  • 116