18

The C++ STL class std::map implements O(log(n)) look-up using a binary tree. But with trees, it's not immediately obvious how an iterator would work. What does the ++ operator actually mean in a tree structure? Whereas the concept of "next element" has an obvious implementation in an array, for me it's not so obvious in a tree. How would one implement a tree iterator?

Publius
  • 1,184
  • 2
  • 10
  • 27
  • You can have a look at the source as a starter: http://www.sgi.com/tech/stl/stl_map.h – amit Sep 04 '12 at 08:33
  • 1
    Look at a typical [self-balancing binary search tree](http://en.wikipedia.org/wiki/Red%E2%80%93black_tree). It's easy to see an algorithm that gets from a given node to the next bigger one by looking to the right children or going up and down the tree. Occasionally you have to jump several times, but it's still amortized constant time (since the height of the tree is the logarithm of the number of elements). – Kerrek SB Sep 04 '12 at 08:46
  • 1
    This wikipedia article may answer some of your questions: [Tree traversal](http://en.wikipedia.org/wiki/Tree_traversal). Basically, the "next" element can be different depending on the kind of traversal you use. In the case of `std::map`, the tree is traversed in order (from the smallest element to the greatest). – Luc Touraille Sep 04 '12 at 08:48

4 Answers4

9

For an inorder traversal (probably works for others too), if you have a parent-pointer in your nodes you can do a non-recursive traversal. It should be possible to just store two pointers in your iterator: you need an indication of where you are, and you'll probably (I'm not doing the research now) need something like a "previous" pointer so you can figure out your current movement direction (i.e. do I need to go into the left subtree, or did I just come back from it).

"Previous" will probably be something like "parent", if we've just entered the node; "left" if we're coming back from the left subtree, "right" if we are coming back from the right subtree, and "self" if the last node we returned was our own.

Christian Stieber
  • 9,954
  • 24
  • 23
  • 1
    An implementation may know that node pointers are word-aligned and abuse the lower bits of the node pointer to store the state, instead of using a second pointer. – MSalters Sep 04 '12 at 09:33
  • I think this is what I needed. Essentially I'm trying to create an iterator for a generic tree class and this appears to work for n-ary unordered trees. – Publius Sep 05 '12 at 03:25
5

I would like to add my two cents worth as a comment, but since I am not able to I shall have to add an answer.  I have been googling and was frustrated because all the answers I found, these excepted, assumed a stack or some other variably-sized data structure. I did find some code.  It shows that it can be done without a stack but I found it hard to follow and so decided to attack the problem from first principles.

The first thing to note is that the algorithm is "left-greedy".  Thus, when we start at the root we immediately go as far left as possible, since the leftmost node is the one we need first.  This means that we never need to consider the left-subtree.  It has already been iterated over.

The order of iteration is left subtree, node, right subtree.  So if we are positioned at a given node we know that its left subtree and the node itself have been visited and that we should next visit the right subtree, if any, going as far left as possible.

Otherwise, we must go up the tree.  if we are going from a left child to its parent then the parent comes next.  (Afterwards we will visit its right subtree, as already covered.)

The final case is when we are going from a right child to its parent.  The parent has been visited already so we must go up again.  In fact we must keep going up until we reach the root or the tree, or find ourselves moving to a parent from its left child. As we have already seen, the parent is the next node in this case.  (The root may be indicated by a null pointer, as in my code, or some special sentinel node.)

The following code could easily be adapted for an STL-style iterator

// Go as far left from this node as you can.
// i.e. find the minimum node in this subtree
Node* Leftmost(Node* node)
{
    if (node == nullptr)
        return nullptr;
    while (node->left != nullptr)
        node = node->left;
    return node;
}

// Start iterating from a root node
Node* First(Node* root)
{
    return Leftmost(root);
}

// The iteration is current at node.  Return the next node
// in value order.
Node* Next(Node* node)
{
    // Make sure that the caller hasn't failed to stop.
    assert(node != nullptr);

    // If we have a right subtree we must iterate over it,
    // starting at its leftmost (minimal) node.

    if (node->right != nullptr)
        return Leftmost(node->right);
    
    // Otherwise we must go up the tree

    Node* parent = node->parent;
    if (parent == nullptr)
        return nullptr;

    // A node comes immediately after its left subtree

    if (node == parent->left)
        return parent;

    // This must be the right subtree!
    assert(node == parent->right);

    // In which case we need to go up again, looking for a node that is
    // its parent's left child.

    while (parent != nullptr && node != parent->left)
    {
        node = parent;
        parent = node->parent;
    }

    // We should be at a left child!
    assert(parent == nullptr || node == parent->left);

    // And, as we know, a node comes immediately after its left subtree

    return parent;
}
jbdp
  • 51
  • 1
  • 2
2

Consider the set of all elements in the map that are not less than the current element that are also not the current element. The "next element" is the element from that set of elements that is less than all other elements in that set.

In order to use a map, you must have a key. And that key must implement a "less than" operation. This determines the way the map is formed, such that the find, add, remove, increment, and decrement operations are efficient.

Generally the map internally uses a tree of some kind.

David Schwartz
  • 179,497
  • 17
  • 214
  • 278
1

Standard implementation of map iterator operator++ watch in stl_tree.h:

_Self&
operator++() _GLIBCXX_NOEXCEPT
{
_M_node = _Rb_tree_increment(_M_node);
return *this;
}

_Rb_tree_increment implementation is discussed here

Max Popov
  • 357
  • 2
  • 12