Print random element from binary tree in O(logn)

Question

Given a binary tree with TreeNode like:

class TreeNode {
    int data;
    TreeNode left;
    TreeNode right;
    int size;
}

Where size is the number of nodes in the (left subtree + right subtree + 1).

Print a random element from the tree in O(logn) running time.

Note: This post is different from this one, as it is clearly mentioned that we have a size associated with each node in this problem.

PS: Wrote this post inspired from this.

generate a random number between 0 and your tree height (not size). Traverse that many nodes by randomly deciding left or right at each node using a random number again. Naive implementation, but it looks like O(n) to me and would be relatively easy to implement. — Araymer, Jan 11 '16 at 05:10
@Araymer Wrong, generate random number between 0 and size for O(logn) algorithm as mentioned in my post below. The whole point of this write up is to do it in O(logn). Also, generating a random number again and again adds another overhead to the time complexity. — Atri, Jan 11 '16 at 05:23
Because the height is logn.... And I didn't downvote. But I'm guessing it's because you gave no working code that shows your attempts at a solution. — Araymer, Jan 11 '16 at 05:24
Possible duplicate of [Getting a random number from a binary tree in O(log n) time](http://stackoverflow.com/questions/13300537/getting-a-random-number-from-a-binary-tree-in-olog-n-time) — Paul Hankin, Jan 11 '16 at 05:26
It should be noted, also, that if the tree isn't balanced there's no way to guarantee O(logn) — Araymer, Jan 11 '16 at 05:28
@PaulHankin I read that post, its not specifically given in the question that each node has a size associated with it. That is why I created this new post to be clear. — Atri, Jan 11 '16 at 05:32
@Atri it's the same problem. If you follow the linked answer, it involves adding a "size" to each node. — Paul Hankin, Jan 11 '16 at 05:47
@softwarenewbie7331 I have written this post as to share knowledge question-answer style. See my answer below. Is that the reason for downvote? — Atri, Jan 11 '16 at 06:03

score 2 · Accepted Answer · answered Jan 11 '16 at 05:03

There is an easy approach which gives O(n) complexity.

Generate a random number in the range of 1 to root.size
Do a BFS or DFS traversal
Stop iterating at random numbered element and print it.

For improving the complexity, we need to create an ordering of our own where we branch at each iteration instead of going sequentially as in BFS and DFS. We can use the size property of each node to decide whether to traverse through the left sub-tree or right sub-tree. Following is the approach:

Generate a random number in the range of 1 to root.size (Say r)
Start traversing from the root node and decide whether to go to left sub-tree, right-subtree or print root:
- if r <= root.left.size, traverse through the left sub-tree
- if r == root.left.size + 1, print the root (we have found the random node to print)
- if r > root.left.size + 1, traverse through the right sub-tree
Essentially, we have defined an order where current node is ordered at (size of left subtree of current) + 1.
Since we eliminate traversing through left or right sub-tree at each iteration, its running time is O(logn).

The pseudo-code would look something like this:

traverse(root, random)
  if r == root.left.size + 1
    print root        
  else if r > root.left.size + 1
    traverse(root.right, random - root.left.size - 1)
  else
    traverse(root.left, random)

Following is an implementation in java:

public static void printRandom(TreeNode n, int randomNum) {
    int leftSize = n.left == null ? 0 : n.left.size;
    if (randomNum == leftSize + 1)
        System.out.println(n.data);
    else if (randomNum > leftSize + 1)
        printRandom(n.right, randomNum - (leftSize + 1));
    else
        printRandom(n.left, randomNum);
}

score 0 · Answer 2 · answered Jan 11 '16 at 11:30

Use size!

Pick a random number q between 0 and n. Start from the root. If left->size == q return current node value. If the left->size < q the go right else you go left. If you go right subtract q -= left->size + 1. Repeat until you output a node.

This give you o(height of tree). If the tree is balanced you get O(LogN).

If the tree is not balanced but you still want to keep O(logN) you can do the same thing but cap the maximum number of iterations. Note that in this case not all nodes have the same probability of being returned.

With your approach, you never return the root. So all elements don't have the same probability of being returned. — Atri, Jan 11 '16 at 15:48

score 0 · Answer 3 · answered Jan 11 '16 at 23:32

Yes, use size!

As Sorin said, pick a random number i between 0 and n - 1 (not n)

Then perform the following instruction:

Treenode rand_node = select(root, i);

Where select could be as follows:

TreeNode select_rec(TreeNode r, int i) noexcept
{ 
  if (i == r.left.size) 
    return r;

  if (i < r.left.size) 
    return select_rec(r.left, i);

  return select_rec(r.right, i - r.left.size - 1);
}

Now a very important trick: the null node must be a sentinel node with size set to 0, what has sense because the empty tree has zero nodes. You can avoid the use of sentinel, but then the select() operation is lightly more complex.

If the trees is balanced, then select() is O(log n)

Print random element from binary tree in O(logn)

3 Answers3