Find all subtrees in a BST whose keys lie in a given range

Question

I was given this question during a recent interview: Given a BST whose nodes contains an Integer as value, find all subtrees whose nodes fall between integers X (min) and Y (max), where X<Y. These subtrees cannot overlap each other.

I have solved variations of this problem, example - print keys of a BST that fall in a given range. But couldn't figure this one out, since it involves finding all connected sub-graphs of the main graph/tree that satisfy very specific constraints. Any pointers/help/pseudo code is appreciated.

Added notes -

The problem defined the datastructure of node as having a left pointer, right pointer, and a integer value. There was no way to mark a node.
Was asked to solve this in Java.
When i said subtree/subgraph, i meant a connected set of nodes, not a list of disjoint nodes. sorry for the confusion.

Please respond by selecting an answer, so that we know the best answer to the given problem. — Sumeet, Jun 30 '15 at 05:05

score 1 · Answer 1 · answered Jun 28 '15 at 09:15

This is pretty much simple to solve. For having subtrees that do not overlap, i included a marked field, initialized to false for every node.

The algorithm is as follows:

Traverse the BST beginning from root using DFS method. Now if a node is encountered in DFS , that is not marked and it satisfies the constraint(falls between X and Y), then there is a solution with a subtree rooted at that node, but we do not know how big that subtree can be? So we do the following:

Pass its left and right child to another method check, which will do the following:

Traverse the subtree rooted at the node and traverse it in DFS fashion as long as constraints are satisfied and nodes encountered are unmarked. As soon as any one condition is violated, then return.

Now, the original DFS method may be called on already marked vertices but the if condition will evaluate to false. Hence the target is achieved.

I solved it using JAVA language and for condition that keys lie between 10 and 21(exclusive). Here is the code:

One more thing,if nothing is printed after subtree rooted at X with childs as, then it denotes a subtree with single node.

class BST
{
 public Node insert(Node x,int key)
 {
     if(x==null)
      return new Node(key,null,null,false);
     else if(key>x.key)
     {
         x.right=insert(x.right,key);
         return x;
     }
     else if(key<x.key)
     {
         x.left=insert(x.left,key);
         return x;
     }
     else {x.key=key;return x;}
 }

 public void DFS(Node x)
 {
     if(x==null)
     return;
     if(x.marked==false&&x.key<21&&x.key>10)
     {
         System.out.println("Subtree rooted at "+x.key+" with childs as");
         x.marked=true;
         check(x.left);
         check(x.right);
     }
     DFS(x.left);
     DFS(x.right);

 }
 public void check(Node ch)
 {
     if(ch==null)
      return;
     if(ch.marked==false&&ch.key<21&&ch.key>10)
     {
         System.out.println(ch.key);
         ch.marked=true;
         check(ch.left);
         check(ch.right);
     }
     else return;

 }
 public static void main(String []args)
 {
     BST tree1=new BST();
     Node root=null;
     root=tree1.insert(root,14);
     root=tree1.insert(root,16);
     root=tree1.insert(root,5);
     root=tree1.insert(root,3);
     root=tree1.insert(root,12);
     root=tree1.insert(root,10);
     root=tree1.insert(root,13);
     root=tree1.insert(root,20);
     root=tree1.insert(root,18);
     root=tree1.insert(root,23);   
     root=tree1.insert(root,15);
     tree1.DFS(root);
 } 
}
class Node
{
 Node left,right;
 int key;
 boolean marked;
 Node(int key,Node left,Node right,boolean b)
 {
  b=false;   
  this.key=key;
  this.left=left;
  this.right=right;
 }
}

Feel free for any queries.

Nico Schertler · Accepted Answer · 2015-06-28T09:39:15.473

The concrete solution depends on the definition of a subtree. Consider the following BST:

And we want to find the subtrees in the range [4,8]. It is obvious that the 4 node belongs to the output. But what about the other half tree? If a subtree refers to a node with all of its children, then that's the entire result. If a subtree is actually a subset of the input nodes, the nodes 5 and 8 belong to the result but their connections to the 3 and 9 nodes have to be stripped away.

In any case, the following algorithm can handle both. The preprocessor define WHOLE_SUBTREES defines whether subtrees are entire subcomponents with all children.

static List<BSTNode> FindSubtreesInRange(BSTNode root, int rangeMin, int rangeMax)
{
    var result = new List<BSTNode>();
    if (IsTreeWithinRange(root, rangeMin, rangeMax, int.MinValue, int.MaxValue, result))
        result.Add(root);
    return result;
}

static bool IsTreeWithinRange(BSTNode root, int rangeMin, int rangeMax, int treeRangeMin, int treeRangeMax, List<BSTNode> resultList)
{
    if (treeRangeMin >= rangeMin && treeRangeMax <= rangeMax)
        return true;
    if ( treeRangeMin > rangeMax || treeRangeMax < rangeMin)
        return false;

    if (root.Key < rangeMin)
    {
        if (root.Right != null && IsTreeWithinRange(root.Right, rangeMin, rangeMax, root.Key + 1, treeRangeMax, resultList))
            resultList.Add(root.Right);
        return false;
    }

    if (root.Key > rangeMax)
    {
        if (root.Left != null && IsTreeWithinRange(root.Left, rangeMin, rangeMax, treeRangeMin, root.Key, resultList))
            resultList.Add(root.Left);
        return false;
    }

    if (root.Left == null && root.Right == null)
        return true;

    if (root.Left == null)
    {
#if WHOLE_SUBTREES
        if (!IsTreeWithinRange(root.Right, rangeMin, rangeMax, root.Key + 1, treeRangeMax, resultList))
            root.Right = null;
        return true;
#else
        return IsTreeWithinRange(root.Right, rangeMin, rangeMax, root.Key + 1, treeRangeMax, resultList);
#endif
    }

    if (root.Right == null)
    {
#if WHOLE_SUBTREES
        if (!IsTreeWithinRange(root.Left, rangeMin, rangeMax, treeRangeMin, root.Key, resultList))
            root.Left = null;
        return true;
#else
        return IsTreeWithinRange(root.Left, rangeMin, rangeMax, treeRangeMin, root.Key, resultList);
#endif
    }

    var leftInRange = IsTreeWithinRange(root.Left, rangeMin, rangeMax, treeRangeMin, root.Key, resultList);
    var rightInRange = IsTreeWithinRange(root.Right, rangeMin, rangeMax, root.Key + 1, treeRangeMax, resultList);

    if (leftInRange && rightInRange)
        return true;

#if WHOLE_SUBTREES
    if (!leftInRange)
        root.Left = null;
    if (!rightInRange)
        root.Right = null;
    return true;   
#else
    if (leftInRange)
        resultList.Add(root.Left);
    if (rightInRange)
        resultList.Add(root.Right);
    return false;
#endif

}

The idea is as follows: If only one subtree of a given node lies within the given range, then this must be the root of a new subtree. If both lie in the range, then they are not the root of a subtree. Instead, the parent level should handle the according decision.

The algorithm starts with the following: We traverse the tree and remember in which ranges the keys may be (treeRangeMin/Max). This allows a fast check if an entire subtree lies in the given range (first statement of the IsTreeWithinRange method.

The next two statements handle the case if the current node's key lies outside the given range. Then only one of it's subtrees might be within the range. If that's the case, this subtree is added to the result list.

Next, we check if the subtrees exist. If both do not, then the current tree is completely contained within the range.

If only one subtree exists, then the action differs based on whether we may split trees. If we may split the tree, the following happens: If the subtree is not within the range, we cut it off and return true (because the current node is contained within the given range). If we may not split trees, we just propagate the result of the recursive call.

Lastly, if both children exist. If one of them is not contained within the range, we cut it off (if we are allowed to). If we are not allowed, we add the subtree to the result list that lies within the given range.

gen-y-s · Answer 3 · 2015-06-28T10:10:35.290

This can be done recursively, and we keep a list of subtrees which we append to whenever a compliant subtree is found. The recursive function returns true when the subtree rooted at the argument node is wholly in range. It's the caller decision (the parent node) to determine what to do when the child's recusruve call returns true or false. For example, if the current node value is in the range , and its children's subtrees are also completely in range, then we simply return true. But if only one of the children's subtrees is in the range, and the other is not in range, then we return false (since the not all of the current node subtree is in the range), but we also append the child that was in the range to the list. If the current node value is not in the range we return false, but we also check either the left or right child, and append it to the list of subtrees if it's compliant:

def subtree_in_range(root, x, y):
  def _subtree_in_range(node):
    in_range=True
    if node:
      if node.val>=x and node.val<=y:
        if not _subtree_in_range(node.left):
          in_range=False
          if node.right and _subtree_in_range(node.right):
            l.append(node.right)
        elif not _subtree_in_range(node.right):
          in_range=False
          if node.left:
            l.append(node.left)
      else:
        in_range=False
        s=node.left
        if node.val<x:
          s=node.right
        if s and _subtree_in_range(s):
          l.append(s)
    return in_range

  l=[]
  if _subtree_in_range(root):
    l.append(root)
  return l

score 1 · Answer 4 · answered Jun 28 '15 at 10:15

When doing range search, workhorse function for range, written in some generic language, might like this:

function range(node, results, X, Y) 
{
    if node is null then return
    if node.key is in [X, Y] then results.add(node.key)
    if node.key < Y then range(node.right, results, X, Y)
    if node.key > X then range(node.left, results, X, Y)
}

For subtree version problem we need to store subtree root nodes instead of keys and keep track if we are in subtree or not. The latter can be solved by passing subtree wise parent in range call, which also is required for new structure creation. Desired function is below. As you can see, main change is one extra argument and node.key in [X, Y] branch

function range_subtrees(node, parent, results, X, Y) 
{
    if node is null then return

    node_clone = null 

    if node.key is in [X, Y] then 
        node_clone = node.clone()
        if parent is null then 
            results.add(node_clone)
        else
            parent.add_child(node_clone)

    if node.key < Y then range_subtrees(node.right, node_clone, results, X, Y)
    if node.key > X then range_subtrees(node.left, node_clone, results, X, Y)
}

This should create a collection of subtree root nodes, where each subtree is a copy of original tree's structure.

Find all subtrees in a BST whose keys lie in a given range

4 Answers4