19

I'm designing a self balancing tree in Haskell. As an exercise and because it is nice to have in your back hand.

Previously in C and Python I preferred Treaps and Splay Trees due to their simple balancing rules. I always disliked R/B Trees, since they seemed like more work than they were worth.

Now, due to the functional nature of Haskell, things seem to have changed. I can write a R/B insert function in 10 lines of code. Treaps on the other hand requires wrapping to store the random number generator, and Splay Trees are a pain to do top-down.

So I'm asking if you have experience with other types of trees? Which ones are better at utilizing the pattern matching and top-down nature of functional languages?

Robert Harvey
  • 178,213
  • 47
  • 333
  • 501
Thomas Ahle
  • 30,774
  • 21
  • 92
  • 114
  • 6
    Not a direct answer, but have a read of "Purely Functional Data Structures" to get some good ideas. – Jeff Foster Nov 12 '10 at 15:25
  • I like it. It doesn't go much into detailed structures, but offer a good general point of view. – Thomas Ahle Nov 12 '10 at 17:02
  • Do you need a search tree, or a tree representation of a sequence (like fingertrees - with concatenation and splitting)? In the latter case, purely functional 2-3 trees are trivial. – jkff Nov 18 '10 at 06:20

4 Answers4

8

Ok, I guess there wasn't a lot of references or research for answering this question. Instead I've taken the time to try your different ideas and trees. I didn't find anything a lot better than RB trees, but perhaps that's just search bias.

The RB tree can be (insertion) balanced with four simple rules, as shown by Chris Okasaki:

balance T (T R (T R a x b) y c) z d = T R (T B a x b) y (T B c z d)
balance T (T R a x (T R b y c)) z d = T R (T B a x b) y (T B c z d)
balance T a x (T R b y (T R c z d)) = T R (T B a x b) y (T B c z d)
balance T a x (T R (T R b y c) z d) = T R (T B a x b) y (T B c z d)
balance T a x b = T B a x b

AVL trees can be balanced in a similar pattern matching way. However the rules don't compress as well:

balance T (T (T a x b   dx) y c (-1)) z d (-2) = T (T a x b dx) y (T c z d  0) 0
balance T a x (T b y (T c z d   dz)   1 )   2  = T (T a x b  0) y (T c z d dz) 0
balance T (T a x (T b y c   1 )   1 ) z d (-2) = T (T a x b -1) y (T c z d  0) 0
balance T (T a x (T b y c (-1))   1 ) z d (-2) = T (T a x b  0) y (T c z d  1) 0
balance T (T a x (T b y c   _ )   1 ) z d (-2) = T (T a x b  0) y (T c z d  0) 0
balance T a x (T (T b y c   1 ) z d (-1))   2  = T (T a x b -1) y (T c z d  0) 0
balance T a x (T (T b y c (-1)) z d (-1))   2  = T (T a x b  0) y (T c z d  1) 0
balance T a x (T (T b y c   _ ) z d (-1))   2  = T (T a x b  0) y (T c z d  0) 0
balance t = t

As AVL trees seams to generally be considered inferior to RB trees, they are probably not worth the extra hassle.

AA trees could theoretically be balanced nice and easily by:

balance T n (T n a x b) y c = T n a x (T n b y c) -- skew
balance T n a x (T n b y (T n c z d)) = T (n+1) (T n a x b) y (T n c z d) --split
balance T n a x b = T n a x b

But unfortunately Haskell don't like the overloading of n. It is possible that a less standard implementation of AA trees, not using ranks, but something more similar to R and B, would work well.

Splay trees are difficult because you need to focus on a single node, rather than the static structure of the tree. It can be done by merging insert and splay.

Treaps are also uneasy to do in a functional environment, as you don't have a global random generator, but need to keep instances in every node. This can be tackled by leaving the task of generating priorities to the client, but even then, you can't do priority comparison using pattern matching.

Thomas Ahle
  • 30,774
  • 21
  • 92
  • 114
6

As you say Red Black trees aren't that hard to use. Have you given finger trees a look? You might be interested in augmenting your base data structure with something like a zipper. Another tree you might find interesting is the AA tree it is a simplification of Red Black Trees.

stonemetal
  • 6,111
  • 23
  • 25
  • Finger Trees look really cool. I'm very happy you let me discover this data structure. They do however not look particularly simple to implement. Most implementations I've seen use a 2-3 tree and generalizes the tree for a lot of use cases. – Thomas Ahle Nov 13 '10 at 19:42
  • 1
    You may be interested in AA trees, they are just simplified red black trees. – stonemetal Nov 16 '10 at 18:28
  • Thank you for pointing me at AA trees, I love them! Unfortunately their use of rank doesn't work well with pattern matching :( – Thomas Ahle Jun 03 '11 at 10:19
4

It's the one that's already implemented.

There are fine implementations in Haskell of balanced trees such as Data.Map and Data.Set. Don't they fulfill your needs? Don't reimplement, reuse.

svenningsson
  • 4,009
  • 1
  • 24
  • 32
  • 4
    For algorithmic purposes it often happens, that you need a tree, that stores specific information in each node. Be it the size of the subtrees in a statistic tree, or the right most point in an interval tree. – Thomas Ahle Jun 11 '11 at 11:50
1

The OCaml standard library uses an AVL tree for its map functor. It seems as though it's easier to implement than an RB-tree if you include a remove operation.

Niki Yoshiuchi
  • 16,883
  • 1
  • 35
  • 44