0

Show that every n-node binary search tree is not equally likely (assuming items are inserted in random order), and that balanced trees are more probable than straight-line trees.

How is it prove mathematical?

afuzzyllama
  • 6,538
  • 5
  • 47
  • 64
Alexey Semenyuk
  • 3,263
  • 2
  • 32
  • 36

2 Answers2

4

Number of possible tree configurations: see With ' N ' no of nodes, how many different Binary and Binary Search Trees possible?

Number of ways to get a single line, most imbalanced, deepest tree with n nodes: 2^(n-1) Explanation: 2 ways to pick up first node (greatest or smallest) X 2 ways to pick up second node (greatest or smallest among the n-1 remaining nodes ... X 2 ways to pick up the (n-1)th node X 1 way to pick up the last node

Number of ways to add n items to a binary tree in such a way that it is balanced: Let g(n,m) denote the number of ways to merge two ordered lists by picking elements from one list or the other one at a time until both lists are empty. g(n,m) = g(n-1,m) + g(n,m-1) which happen to correspond to the numbers defined in the Pascal Triangle. That would give n+m chose n or n+m chose m = (n+m)! / n! (n+m-n)! = (n+m)!/(n! m!) Example: Number of ways to merge two ordered lists containing 2 elements each = 4!/(2! 2!) = 24 / 4 = 6 Assuming the two lists are as follows:

1 A
2 B

The six merging combinations are:

1 1 1 A A A
2 A A B 1 1
A 2 B 1 B 2
B B 2 2 2 B

Now, let f(n) denote the number of combinations in which n sorted elements can be added to a empty binary tree such that the binary tree is balanced. The only way to achieve that is to first add

  • the middle element if n is odd. That would be element ceiling(n/2). Each side would have n/2-1 elements.
  • either element n/2 or element n/2+1 if n is even. One side would have n/2-1 element, the other side n/2 elements and vice versa.

Once the middle element is selected, all elements to the left will always fall on the left subtree and all elements on the right will always fall on the right subtree. Assuming the elements on the right are ordered in such a way that the left subtree is balanced and the elements on the right are also ordered in such a way that the right subtree is balanced, merging the two lists in any way will always result in both subtree being balanced. That means that for each list of n and m elements that respectively fall on the left and right subtree such that both subtrees are balanced, there are (n+m)!/(n!m!) to merge them so as to achieve the same result.

That would give us (n+m)!/(n!m!) x f(n) x f(m)

We can now formulate this as follows: Base cases:

f(1) = 1
f(2) = 2

General case:

f(n) = (n-1)!/[((n-1)/2)!]^2 x [f((n-1)/2)]^2  | n odd
f(n) = (n-1)!/((n/2-1)! x (n/2)!) x 2 x f(n/2-1) x f(n/2) | n even

Rest to transform this into a non recursive formula in terms of n. Maybe it would be easier to start with the easiest case where n = 2^m - 1 (i.e. removing a node and dividing by two will always give a whole number and will result in a completely balanced tree).

Note: I posted a related math question here: https://math.stackexchange.com/questions/460767/recurrent-relation-for-number-of-ways-to-get-a-balanced-n-binary-tree

Following is a C# console application that lists all the sequences in which nodes can be added to a binary tree so as to have it balanced (Run it here ):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace AllBalancedTrees
{
    class Program
    {
        static void Main(string[] args)
        {
            char[] nodes = { 'A', 'B', 'C', 'D', 'E' };

            AllBalancedTrees<char> AllBalancedTrees = new AllBalancedTrees<char>(); 

            foreach (char[] a in AllBalancedTrees.allBalancedTreeCombinations(nodes))
            {
                foreach (char c in a)
                {
                    Console.Write(c + " ");
                }
                Console.WriteLine();
            }

            Console.ReadKey();
        }
    }

    class AllBalancedTrees<T>
    {
        public IEnumerable<T[]> allBalancedTreeCombinations(T[] nodes)
        {
            T[] result;
            if (nodes.Length == 1)
            {
                yield return nodes;
            }
            else if (nodes.Length == 2)
            {
                yield return nodes;
                T[] nodes2 = new T[2];
                nodes2[0] = nodes[1];
                nodes2[1] = nodes[0];
                yield return nodes2;
            }
            else if (nodes.Length % 2 == 1)
            {
                int mid = (nodes.Length - 1) / 2;
                T[] left = new T[mid];
                T[] right = new T[mid];
                Array.Copy(nodes, 0, left, 0, mid);
                Array.Copy(nodes, mid + 1, right, 0, mid);
                foreach (T[] l in allBalancedTreeCombinations(left))
                {
                    foreach (T[] r in allBalancedTreeCombinations(right))
                    {
                        result = new T[nodes.Length];
                        result[0] = nodes[mid];
                        foreach (T[] a in allMergeCombinations(l, r))
                        {
                            Array.Copy(a, 0, result, 1, a.Length);
                            yield return result;
                        }
                    }
                }
            }
            else
            {
                int mid = (nodes.Length) / 2;
                T[] left = new T[mid];
                T[] right = new T[mid - 1];
                Array.Copy(nodes, 0, left, 0, mid);
                Array.Copy(nodes, mid + 1, right, 0, mid - 1);
                foreach (T[] l in allBalancedTreeCombinations(left))
                {
                    foreach (T[] r in allBalancedTreeCombinations(right))
                    {
                        result = new T[nodes.Length];
                        result[0] = nodes[mid];
                        foreach (T[] a in allMergeCombinations(l, r))
                        {
                            Array.Copy(a, 0, result, 1, a.Length);
                            yield return result;
                        }
                    }
                }

                mid = nodes.Length / 2 - 1;
                left = new T[mid];
                right = new T[mid + 1];
                Array.Copy(nodes, 0, left, 0, mid);
                Array.Copy(nodes, mid + 1, right, 0, mid+1);
                foreach (T[] l in allBalancedTreeCombinations(left))
                {
                    foreach (T[] r in allBalancedTreeCombinations(right))
                    {
                        result = new T[nodes.Length];
                        result[0] = nodes[mid];
                        foreach (T[] a in allMergeCombinations(l, r))
                        {
                            Array.Copy(a, 0, result, 1, a.Length);
                            yield return result;
                        }
                    }
                }


            }
        }

        public IEnumerable<T[]> allMergeCombinations(T[] firstArray, T[] secondArray)
        {
            if (firstArray.Length == 0)
            {
                yield return secondArray;
            }
            else if (secondArray.Length == 0)
            {
                yield return firstArray;
            }
            else
            {
                T[] result;

                T[] firstMinusOne;
                firstMinusOne = new T[firstArray.Length - 1];
                Array.Copy(firstArray, 1, firstMinusOne, 0, firstMinusOne.Length);
                foreach (T[] a in allMergeCombinations(firstMinusOne, secondArray))
                {
                    result = new T[firstArray.Length + secondArray.Length];
                    result[0] = firstArray[0];
                    Array.Copy(a, 0, result, 1, a.Length);
                    yield return result;
                }

                T[] secondMinusOne;
                secondMinusOne = new T[secondArray.Length - 1];
                Array.Copy(secondArray, 1, secondMinusOne, 0, secondMinusOne.Length);
                foreach (T[] a in allMergeCombinations(firstArray, secondMinusOne))
                {
                    result = new T[firstArray.Length + secondArray.Length];
                    result[0] = secondArray[0];
                    Array.Copy(a, 0, result, 1, a.Length);
                    yield return result;
                }

            }
        }
    }
}
Community
  • 1
  • 1
Tarik
  • 10,810
  • 2
  • 26
  • 40
  • thanks for helping... I found in this article only a formula for total count binary search tree of n-size. I am interesting the formula for a total count balanced binary search tree of n-size. It is simple for straight-line trees, for all this trees will be 2 variant. – Alexey Semenyuk Aug 05 '13 at 07:44
  • I kept thinking about the problem and I think I got a recursive formulation for it but not a formula. I will need some time to explain the idea rather than just providing a recursive algorithm. – Tarik Aug 05 '13 at 13:12
  • Straight line binary trees can indeed come in two ways. However, the other forms of completely imbalanced trees can come in many more combinations as previously described. Are you only concerned about straight line trees and balanced trees? – Tarik Aug 05 '13 at 13:19
  • Yes, I should prove only balanced trees are more probable than straight-line trees – Alexey Semenyuk Aug 05 '13 at 13:38
  • Probability position n-node binary search tree equals 1/n : 1-node should be root => P(1)=1 ; 2-node less or more 1-node => P(2)=1/2; 3-node, because 3 childs are free(null) P(3)=1/3.May be, is this proof better for the first part(show that every n-node binary search tree is not equally likely )? – Alexey Semenyuk Aug 05 '13 at 13:59
0

Every n-node binary search tree is not equally likely because, if items are inserted in random order, there is a much greater probability that the input will not be in strictly increasing or decreasing order. As long as the items are not in ascending or descending order, a straight-line tree is an impossibility. For example, in a tree of four records with keys 1, 2, 3, and 4, there are 4! or 24 ways for these items to be ordered as input. Only two of these ways can result in a straight-line tree (4, 3, 2, 1 and 1, 2, 3, 4). In this case the probability of a straight-line tree is approximately 8.3%, whereas the probability of a (somewhat) balanced tree is 91.6%. Clearly, the odds are against the chances of getting a straight-line tree for a result.

jaxkewl
  • 195
  • 10