3

Given a sequence of numbers, I want to insert the numbers into a balanced binary tree such that when I do a inorder traversal on the tree, it gives me the sequence back.

How can I construct the insert method corresponding to this requirement?

Remember that the tree must be balanced, so there isn't a completely trivial solution. I was trying to do this with a modified version of an AVL tree, but I'm not sure if this can work out.

I also wish to be able to implement a delete operation. Delete should delete the item at the ith position in the list.

Edit: clarification:

I want to have: Insert(i, e), which inserts a single element e right before the ith element in the sequence. Delete(i), which deletes the ith element of the sequence.

If I do insert(0, 5), insert(0, 4), insert(0, 7), then my stored sequence is now 7, 4, 5 and inorder traversal on the binary tree should give me 7, 4, 5.

If I do delete(1), then inorder traversal on the binary tree should give me 7, 5. As you can see, the sequence order is maintained after deleting the ith sequence element with i = 1 in this case (as I've indexed my sequence from 0).

Olhovsky
  • 5,466
  • 3
  • 36
  • 47
  • 1
    Should this be done without extra space? – Kakira Feb 16 '11 at 23:24
  • Kakira, it cannot be done without extra space. You need at least n*sizeof(reference) extra space. – corsiKa Feb 16 '11 at 23:34
  • It seems like it should be possible to implement without using extra space, no? – Olhovsky Feb 16 '11 at 23:55
  • Nope, it can't be. You need references to hold the information of what order they were inserted in. You do not need space on the order of the size of the data. The size of the tree before adding the list would be, assuming a parent, left, and right, `Size = N * (3*ref + size(data))` where ref is the size of a reference, and N is the number of nodes. Adding this will cause it to be `Size = N * (4*ref + size(data))` or `Size = N * (5*ref + size(data))`, depending on if your list is doubly linked or not. The ~ORDER~ of space required is unchanged, but the constant *is* changed. – corsiKa Feb 17 '11 at 00:13
  • Oh I misunderstood, I thought that "extra" space meant a more than a constant multiple of the number of nodes. I see now that your comment excluded that possibility :P – Olhovsky Feb 17 '11 at 00:21
  • It can be done without extra space. Simply create the tree so that preorder traversal provides the right ordering. There's no need for an extra linked list. – divegeek Feb 17 '11 at 03:20
  • @divegeek, that is what I'm trying to achieve. Creating a tree from an initial sequence is trivially easy, my question is how to maintain said order upon additional insertions/deletions. – Olhovsky Feb 17 '11 at 11:02
  • In your edit you specified inorder traversal, where in your original statement of the problem you specified preorder traversal. Which is it? – divegeek Feb 17 '11 at 12:17

3 Answers3

1

Keep a linked list in the tree. You already have pointers/references to the nodes thanks to the tree. When you insert into the tree, also insert at the end of the linked list. When you remove from the tree, remove from the linked list. Because you have the references, they're all O(1) operations, and you can iterate through in insertion order if you so choose.

Edit: to clarify Bart's concern, here's a Java implementation

class LinkedTreeNode<E> {
   E data;
   LinkedTreeNode<E> parent;
   LinkedTreeNode<E> left;
   LinkedTreeNode<E> right;
   LinkedTreeNode<E> insertNext;
   LinkedTreeNode<E> insertPrev;
}

class LinkedTree<E> {
    LinkedTreeNode<E> root;
    LinkedTreeNode<E> insertHead;
    LinkedTreeNode<E> insertTail;

    // skip some stuff
    void add(E e) {
        LinkedTreeNode<E> node = new LinkedTreeNode<E>(e);

        // perform the tree insert using whatever method you like

        // update the list
        node.insertNext = insertTail;
        node.insertPrev = insertTail.insertPrev.insertPrev;
        node.insertPrev.insertNext = node;
        insertTail.insertPrev = node;
    }

    E remove(E e) {
        LinkedTreeNode<E> rem = // whatever method to remove the node
        rem.insertNext.insertPrev = rem.insertPrev;
        rem.insertPrev.insertNext = rem.insertNext;
        return rem.data;
    }
} 
corsiKa
  • 81,495
  • 25
  • 153
  • 204
  • Removing from a linked list is O(n) though, even if you know the reference (or index). A linked-hashset would do, I guess (`add(...)` and `remove(...)` in O(1) with the order in tact). – Bart Kiers Feb 16 '11 at 23:31
  • 1
    No, removing from a linked list is O(1) if you have the reference AND access to the pointers of the nodes. Using the method list.remove(obj) might be O(n), but I'm assuming that he has the ability to directly work with the nodes. – corsiKa Feb 16 '11 at 23:34
  • Then you're not talking about a `java.util.LinkedList`. See: [this Q&A](http://stackoverflow.com/questions/322715/when-to-use-linkedlist-over-arraylist): `remove(...)` is O(n). – Bart Kiers Feb 16 '11 at 23:36
  • That's correct, I'm not talking about a `java.util.LinkedList` - OP is looking for an algorithm, not a Java collection. – corsiKa Feb 16 '11 at 23:41
  • Never mind, I thought I saw the tag `java` somewhere. But many linked-lists have O(n) remove methods, so simply stating that such an operation is O(1) is a bit misleading (all IMHO, of course!). – Bart Kiers Feb 16 '11 at 23:44
  • Yes, you have O(lg n) removal from the tree. But because you found the node with the removal from the tree, you do not need to find it again to remove from the list. In fact, if you were so inclined, you could ~not~ remove the node from the linked list on removal from the tree, so you could get a clear picture of the true insertion order, and perhaps add a node to the end when you remove to serve as a removal marker. – corsiKa Feb 17 '11 at 00:11
  • @Moron, you're right -- and it's my fault for lack of specificity in my question. It's not very useful to simply tack a linked list onto the side of a binary tree. My question is really about how we can maintain a binary tree that produces the sequence given upon inorder traversal. – Olhovsky Feb 17 '11 at 10:29
  • @TheBigO: That's a very different problem than I read. If that were the case, internally, you need `int inserts`, and when you do an insert, use `inserts` as your key into the tree. It's worth noting that this won't give you O(lg n) lookup based on the value unless you actually maintained two trees (you could have them in the same nodes, just using different pointers) one for the actual values, and one for the insert order. – corsiKa Feb 18 '11 at 00:00
  • @glowcoder, thanks for the contributions, and sorry for the confusion. It turns out though that it can all be done in O(log n) time with a single AVL tree. – Olhovsky Feb 18 '11 at 12:14
  • @TheBigO Will this single tree still give you the ability to have O(lg n) lookup with values as keys? Or only using inserition order as keys... – corsiKa Feb 18 '11 at 15:05
  • @glowcoder It provides O(log n) look up of the item using the rank as a key. You're right, you'd need a second tree if you wanted to have two different lookup criteria in O(log n) time, (value as key vs. rank as key). – Olhovsky Feb 18 '11 at 15:52
  • @TheBigO And therein lies the difference. I (until the comment 15 hours before this one) was under the impression that you still needed to keep the ability to search by value as well as insertion order. – corsiKa Feb 18 '11 at 15:55
  • @glowcoder, Fair enough, although I never said in the question that searching by value was necessary, only that inorder traversal produced the sequence. For my application, the main problem with using a linked list in addition to the tree is that the list forces other operations to become O(n), such as summing a segment of the sequence, which can otherwise be done in O(log n) with a tree. In the future I'll provide more motivation for what I'm trying to do in my question. – Olhovsky Feb 18 '11 at 17:00
  • @TheBigO Definitely a good idea. The more relevant information the better! Of course, you need to insure you don't drown out the good stuff with noise, so it's always a tricky deal. – corsiKa Feb 18 '11 at 18:18
1

If you have random access to the contents of the sequence (e.g. you have an array), then you can simply build the tree directly, so that pre-order traversal provides the sequence you want. It can actually be done without random access, but less efficiently, since linear scans may be required to find the midpoint of the sequence.

The key observation here is that the first value in the sequence goes in the root of the tree, then the remainder of the sequence can be divided into halves. The first half will go into the subtree rooted at the left child and the second half will go into the subtree rooted at the right child. Because you are dividing the remaining list in half at each step, the resulting tree will have log_2 n levels and will be balanced.

Here's some C++ code. Note that "end" refers to the index of the element just past the end of the current array segment.

struct Node {
    Node(int v) : value(v), l(0), r(0) {}

    int value;
    Node* l;
    Node* r;
};

Node* vector_to_tree(int begin, int end, int array[]) {
    if (begin == end)
        return NULL;

    Node* root = new Node(array[begin++]);
    int mid = (begin + end) / 2;
    root->l = vector_to_tree(begin, mid, array);
    root->r = vector_to_tree(mid,   end, array);

    return root;
}

void preorder_traverse(Node* root) {
    if (!root)
        return;

    std::cout << root->value << std::endl;
    traverse(root->l);
    traverse(root->r);
}
divegeek
  • 4,795
  • 2
  • 23
  • 28
  • This is more along the lines of what I was looking for, however it appears that this will require O(n) removal, and O(n) inserts, if one is to insert an item after the initial sequence is given. – Olhovsky Feb 17 '11 at 10:27
  • I don't have time right now to think about it, but it seems like for insertion it should be possible to calculate the position in the tree of the new element based on its index in the sequence in at most O(lg n) time -- it may actually be doable in constant time. Then insertion requires walking the tree to the appropriate location plus some rotations to keep it balanced. Deletion should be similar. That's what my intuition says; I need to think it through in detail. – divegeek Feb 17 '11 at 12:15
  • you can find the item to delete in log n time, but if you delete it, you need to reorder the vector, which is O(n). – Olhovsky Feb 18 '11 at 12:07
1

This can be done with an AVL tree.

Add items to the AVL tree by adding to the rightmost node of the tree.

AVL balancing does not affect the inorder traversal property due to the nature of rotations.

Storing the size of the tree at each node, and maintaining these values for each node affected during inserts/deletes/balances can be done in O(log n) time. Storing the size of the tree at each node permits searching for an item by rank in the sequence, since the rank of an item in a tree is known by a node's left child's size + 1.

Deletion from the AVL tree can be accomplished by replacing a removed node with the leftmost node of the node's right subtree.

Insert and delete operations are then done in O(log n) time, since the tree is always balanced, and size updates on the nodes are done along a path from inserted/deleted node to root.

Since there is no backing data structure other than a binary tree, other operations that benefit from a tree can be done, such as finding the sum of an [m, n] range of elements in the sequence in O(log n) time.

Olhovsky
  • 5,466
  • 3
  • 36
  • 47