18

Does anyone know how to figure out search time for a binary search tree(i.e. worst-case, best-case, and average-case)?

Robert Gowland
  • 7,677
  • 6
  • 40
  • 58

3 Answers3

35

For a non-self-balancing tree (possible but unusual for a search tree), worst case is O(n), which is for the degenerate binary tree (a linked list).

In this case, you have to search, on average, half the list before finding your desired element.

Best case is O(log2 n) for a perfectly balanced tree, since you cut the search space in half for every tree level.

Average case is somewhere in between those two and depends entirely on the data :-)

Since you rarely get to control the sequence in which data is inserted into a tree, self-balancing trees are usually preferable since, while they add a small amount of time to each insertion or deletion, they greatly speed up searching. Their worst case is so much better than unbalanced trees.

                 8
         _______/ \_______
        /                 \
       4                  12
    __/ \__             __/ \__
   /       \           /       \
  2         6        10        14
 / \       / \       / \       / \
1   3     5   7     9  11    13  15

In this perfectly balanced tree, you can see that you get 2n-1 nodes for every n levels. That means for 15 nodes, you never have to search more than four nodes to find it (e.g., to find 13, you search 8, 12, 14 and 13). That's where the log2n figure comes from.

A degenerate unbalanced tree, as already stated, is a linked list. If your data arrived in sequence and you inserted it into an unbalanced binary tree, you'd get:

1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -+
                                           |
+------------------------------------------+
|
+-> 10 -> 11 -> 12 -> 13 -> 14 -> 15

To find 13 in that case, you'd need to search 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13, hence O(n).

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
  • but what if its completely random numbers and no where near balanced, is there some kind of method to figure out the cases? –  Feb 09 '09 at 00:05
  • No, because it depends both on your tree and what you're searching for. – paxdiablo Feb 09 '09 at 00:06
  • 1
    Algorithms exist to balance trees on insert/delete. Thus, random data is no problem for the right implementation. – Daniel Spiewak Feb 09 '09 at 00:08
  • That would have been a *balanced* or *self-balancing* binary tree. In my view, a binary tree is the simplistic one. – paxdiablo Feb 09 '09 at 00:09
  • I agree, but I think the information is worth presenting since the intuitive analysis can often be wrong if balanced trees are not well understood. – Daniel Spiewak Feb 09 '09 at 00:11
  • Shouldn't the best case be O(1) -- when the key is at the root of the tree? – Barun May 22 '14 at 14:36
  • @paxdiablo Average case depends entirely on data ? Average case means averaging over all input data which is present, then, I think dependency on a particular case is not done. By the way, best case should be O(1) when the key is the root of the tree. – 0decimal0 Sep 26 '14 at 17:55
  • @Carbon, there are an infinite number of possible data sets so you can't average all of them for an unbalanced tree. It may because I've been unclear that the first bit was talking about non-self-balancing trees, I'll clarify. As for best case being O(1), I don't believe that's the case. Complexity is a property of the algorithm, not the data. The only reason I stated data was important in the non-self-balanced tree is that the complexity can't be calculated except for a specific data set. – paxdiablo Sep 26 '14 at 23:56
13

Might want to tag this one as "homework". Here's a good starting point: http://en.wikipedia.org/wiki/Binary_search_tree

In general, a balanced binary search tree has a worst-case lookup of O(log n), best case of O(1) (when the desired value is the root) and an average case of O(log n) (the leaves contain exponentially more values than their parents).

The worst case is the most interesting and is easily seen by recognizing that the first level of a binary tree has 1 node, the second has 2, the third has 4 and so on. Thus, the number of nodes in a binary tree of depth n is precisely 2^n - 1. The mathematical inverse of the exponential function is the logarithm, thus: O(log n).

An unbalanced tree can be as bad as a linked list and may have a shape like the following:

  1
 / \
    2
   / \
      3
     / \
        4
       / \

In this situation, the worst-case access time is O(n).

Daniel Spiewak
  • 54,515
  • 14
  • 108
  • 120
  • 2
    An unbalanced tree is any tree that's not perfectly balanced (i.e., depth of one subtree is different to another subtree). What you're referring to (the linked list) is a degenerate tree. – paxdiablo Feb 09 '09 at 00:08
  • 1
    Changing my answer to be slightly more accurate. In general, if you do not specify "balanced", then the worst case is O(n), regardless of whether it is *actually* degenerate. – Daniel Spiewak Feb 09 '09 at 00:10
  • The picture in the post is known as - `Skewed Binary Search Tree` – RBT Dec 29 '17 at 02:32
2

Best case is O(1). the first element could be the item you are looking for. worst case is O(n) ie in a skewed tree and average case is O(lg n).

sjsupersumit
  • 154
  • 2
  • 11