2

I came to know the height of Random-BST/Red-Black trees and some other trees are O(log n).

I wonder, how this can be. Lets say I have a tree like this

Binary tree

The height of the tree is essentially the depth of the tree, which is in this case will be 4 (leaving the parent depth). But how could people say that the height can be represented by O(log n) notion?

I'm very to algorithms, and this point is confusing me a lot. Where I'm missing the point?

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
sriram
  • 8,562
  • 19
  • 63
  • 82
  • possible duplicate of [Binary Tree Height](http://stackoverflow.com/questions/1951091/binary-tree-height) – Kristopher Micinski Sep 09 '12 at 04:01
  • @KristopherMicinski: It can't be! – sriram Sep 09 '12 at 04:02
  • explain, why can't it be? It looks like the same answer, you can prove the formula by induction on the structure of trees, the base case is a leaf, the inductive case assumes two subtrees. This is called a _complete_ binary tree, but in general O(log n) will be an upper bound. – Kristopher Micinski Sep 09 '12 at 04:04
  • @KristopherMicinski: So what you typically saying is that "the maximum height of the tree can be of `n` and the minimum height is of `O(log n)` in case of Red black tress and other trees which are self balanced. Am I right or missing something here? – sriram Sep 09 '12 at 04:07
  • of course, if you have n nodes, the degenerate case is a chain, which has n nodes, and the lower bound on height (not upper bound as I previously said, that was a typo) is a lower bound. You can prove it inductively using the principle I just mentioned. – Kristopher Micinski Sep 09 '12 at 04:09
  • @KristopherMicinski: Cool. Your comments seems to be correct. Why can't put it as an answer? – sriram Sep 09 '12 at 04:10
  • Check out [Big O(h) vs. Big O(logn) in trees](http://stackoverflow.com/questions/12258114/big-oh-vs-big-ologn-in-trees). I tried to explain the difference using a draw of two valid binary search trees - one is balances and the other is not. – amit Sep 09 '12 at 05:52
  • see my answer here: http://stackoverflow.com/questions/2307283/what-does-olog-n-mean-exactly/13093274#13093274 – 2cupsOfTech Oct 26 '12 at 20:11

2 Answers2

2

In algorithm complexity the variable n typically refers to the total number of items in a collection or involved in some calculation. In this case, n is the total number of nodes in the tree. So, in the picture you posted n=31. If the height of the tree is O(log n) that means that the height of the tree is proportional to the log of n. Since this is a binary tree, you'd use log base 2.

⌊log₂(31)⌋ = 4

Therefore, the height of the tree should be about 4—which is exactly the case in your example.

DaoWen
  • 32,589
  • 6
  • 74
  • 101
  • 1
    this is correct but slightly misleading, in general this formula is only used with a ceiling on the logarithm, so you're only ever talking about whole numbers, and as I explained in my comment, this is an upper bound, which is only true in the case of a complete binary tree. – Kristopher Micinski Sep 09 '12 at 04:07
  • Also note that the difference between different log bases is just a constant factor, which is why you don't see the base written in big O notation. – Laurence Gonsalves Sep 09 '12 at 04:16
  • @LaurenceGonsalves in this case it actually does matter, because the log base depends on the branching factor. – Kristopher Micinski Sep 09 '12 at 04:18
  • @Kristopher Micinski - Thanks for the good point—I've updated my answer. However, I'm pretty sure that using `floor` here gives a cleaner answer than `ceil` would. – DaoWen Sep 09 '12 at 04:20
  • The base is useful in the explanation, but inside the big-O it doesn't matter. Increasing the branching factor by a constant just decreases the balanced height by a constant factor. (You can prove this using the laws of logarithms.) For example, switching from a branching factor of 2 to a branching factor of 8 just makes the balanced tree 1/3 as tall. That is, O(log_8(n)) = O(1/3 * log_2(n)), and so we might as well leave out the base entirely: O(log(n)) – Laurence Gonsalves Sep 09 '12 at 18:31
  • @LaurenceGonsalves yes, wrt to complexity that's correct, but if you actually need to calculate the lower bound (e.g., for allocating memory or something..) then it would matter, I'm saying that there is more that matters here than the big-O complexity.. – Kristopher Micinski Sep 09 '12 at 19:09
1

As I explained in a comment, a binary tree can have multiple cases:

  • In the degenerate case, a binary tree is simply a chain, and its height is O(n).
  • In the best case (for most search algorithms), a complete binary tree has the property that for any node, the height of the subtrees are the same. In this case the length will be the floor of log(n) (base 2, or base k, for k branches). You can prove this by induction on the size of the tree (structural induction in the constructors)
  • In the general case you will have a mix of these, a tree constructed where any node has subtress with possibly different height.
Kristopher Micinski
  • 7,572
  • 3
  • 29
  • 34
  • I think the height function should really use `floor` rather than `ceiling`. Take the case of going from `n=3` to `n=4` for instance: if you use `ceiling` then the height does not change but if you use `floor` the height correctly changes from 1 to 2. – DaoWen Sep 09 '12 at 04:32
  • @DaoWen yup, forgot to actually draw and hastily wrote from intuition :) – Kristopher Micinski Sep 09 '12 at 04:34