
Background

I'm working through Ullman's Elements of ML Programming in my spare time. My end goal is to self-study Andrew Appel's Modern Compiler Implementation in ML.

In Elements of ML, Ullman describes the difference list:

There is a trick known to LISP programmers as difference lists, in which one manipulates lists more efficiently by keeping, as an extra parameter of your function, a list that represents in some way what you have already accomplished. The idea comes up in a number of different applications;

Ullman uses reverse as an example of the difference-list technique. Here is a slow version that runs in O(n^2):

fun reverse nil = nil
  | reverse (x::xs) = reverse(xs) @ [x]

And the faster one using a difference list:

fun rev1(nil, ys) = ys
  | rev1(x::xs, ys) = rev1(xs, x::ys)

fun reverse L = rev1(L, nil)

My problem

I have this Binary Search Tree (BST) data type.

datatype 'a btree = Empty
      | Node of 'a * 'a btree * 'a btree

A naive solution for collecting a list of the elements in pre-order would be

fun preOrder Empty = nil
  | preOrder (Node(x, left, right)) = [x] @ preOrder left @ preOrder right

But Ullman points out that the @ operator is slow and suggests in exercise 6.3.5 that I implement preOrder using a difference list.

After some head scratching I came up with this function:

fun preOrder tree =
    let
        fun pre (Empty, L) = L
          | pre (Node(x, left, right), L) =
                let
                    val L = pre(right, L)
                    val L = pre(left, L)
                in
                    x::L
                end
    in
        pre (tree, nil)
    end

It outputs the elements in pre-order. BUT it evaluates the tree in post-order! And the code is uglier than the naive preOrder one.

> val t = Node(5, 
    Node(3, 
       Node(1, Empty, Empty), 
       Node(4, Empty, Empty)), 
    Node(9, Empty, Empty))
> preOrder t
val it = [5,3,1,4,9] : int list

Prior Art

I tried searching for references to difference lists in ML programming and found John Hughes's original article describing how to use difference lists for reverse.

I also found Matthew Brecknell's difference list blog post with examples in Haskell. He makes a distinction between using an accumulator, like Ullman's reverse example, and creating a new type for difference lists. He also presents a tree flattener. But I have a hard time understanding the Haskell code and would appreciate a similar exposé in Standard ML.


Question

  • How do I implement a function that actually evaluates the tree in pre-order and collects the elements in pre-order? Do I have to reverse the list after my traversal? Or is there some other trick?

  • How can I generalize this technique to work for in-order and post-order traversals?

  • What is the idiomatic way of using a difference list for a BST algorithm?

Daniel Näslund
  • "expecting to find many references to the technique, but came up short-handed" this seems a bit dubious, I just searched it an there's tons of results. Or do you mean SML specific? I don't think you'll have trouble understanding examples in other functional languages and converting them to SML. As for "Can this be generalized..." : yes, you can write any recursive function which returns a chain of append operations `.. @ .. @ .. @ ..` in a difference-list style. – user2407038 Jul 07 '21 at 20:34
  • @user2407038 Yes I meant SML specific resources. I found John Hughes paper about the difference list technique and various prolog papers. I was hoping for lecture like material presenting more variations of the technique so that it would stick for me. – Daniel Näslund Jul 07 '21 at 20:40
  • @DanielNäslund Unfortunately I cannot offer you any SML specific resources, as I'm not overly familiar with SML (Also, requests for external resources are technically OT). As far as "idiomatic" goes - I would say that the idiomatic way of doing this is to define a difference list type, and the concatenation operator for this type, which will allow you to write the difference-list version in a way which is syntactically very similar to the original. – user2407038 Jul 07 '21 at 20:52
  • @user2407038 I've tried to clarify my question. An example of a difference list type in SML would be a great answer! – Daniel Näslund Jul 08 '21 at 05:28
  • Both functions produce the same result, and your example is the preorder. (And you can cut down `pre` to `x::pre(left, pre(right, L))`, which is very similar to the first version.) An actual problem with the second version is that it's not tail recursive. – molbdnilo Jul 08 '21 at 07:05
  • @molbdnilo Ah, much better. Having to remember to make funcs tail-recursive seems like a bit of a burden for functional programming. But your function still walks the tree in post-order (but the returned list is in pre-order). So in that sense it's not the same as my first naive version using the @ operator. And how would you do a similar function to get inorder traversal? It seems weird to me that you have to do these kind of perf-adjustments in ML that alters the structure of the algorithm, when similar code is very straightforward in Python/C/C++/Java. – Daniel Näslund Jul 08 '21 at 07:20
  • Ah, you're focusing on the order of evaluation, not the result. – molbdnilo Jul 08 '21 at 07:33
  • @molbdnilo It's just that the algorithm is so easily expressed in the naive form. Having to go through these kind of contortions for speed-optimizations seems a little off to me: I thought ML was designed to be very straightforward to apply for these kind of tree algorithms. That the code would map close to the mathematical definition. Jumping through these kind of hoops looks inelegant to me. – Daniel Näslund Jul 08 '21 at 07:37
  • It does map closely to the mathematical definition: a preorder list is the node value, followed by the preorder traversal of the left child, followed by the preorder traversal of the right child. Also, if you want to avoid the (most likely) non-tail recursion in other languages by using a loop, you need to maintain your own evaluation stack, and that's pretty far from elegant. – molbdnilo Jul 08 '21 at 07:42
  • related: [tag:preorder]. related: [tag:tailrecursion-modulo-cons]. [related](http://www.dreamsongs.com/10ideas.html) (find "gopher"). a very clear Prolog answer is [this one](https://stackoverflow.com/a/26967655/849891). But it uses mutation -- or rather, explicit setting of a logical variable (which we are allowed to do, only once, to an _uninstantiated_ one). _This_ technique is emulated in Haskell with _"knot tying"_. now that one really ties one's mind in a knot! functional composition of partially applied append operators is _easy_ (in comparison). :) – Will Ness Jul 09 '21 at 13:18
  • BTW the variables really ought to be numbered, for the sake of clarity, even if using the same name, `L`, is allowed by your language: `fun pre (Empty, L) = L` `| pre (Node(x, left, right), L1) = let` `val L2 = pre(right, L1)` `val L3 = pre(left, L2)` `in x::L3 end` makes the data flow _self-evident_. now it's not ugly, either. :) BTW in Haskell, with its lazy evaluation, this exact code will traverse the tree left-to-right! – Will Ness Jul 09 '21 at 13:47
  • (contd.) same with molbdnilo's `pre (Node(x, left, right), L) = x::pre(left, pre(right, L))`: it walks the tree right-to-left in a strict language, but left-to-right in a lazy one! evaluation order is determined by the language's evaluation strategy. --- re the difference from the simple code, `[x] @ pre(left) @ pre(right)`, if we only look at the names (`left`, `x`, `right`) and see all the symbols as "noise", it's really the same. :) for the in-order, it's either `in(left) @ [x] @ in(right)`, or `in(left, x::in(right, L))` -- same principle: the names come in the same order. – Will Ness Jul 09 '21 at 14:09

2 Answers


Your eventual method of doing this is about as good as it reasonably gets. The nice way to do this turns out to be

fun preOrderHelper (Empty, lst) = lst
  | preOrderHelper (Node(x, left, right), lst) = 
        x :: preOrderHelper(left, preOrderHelper(right, lst))

fun preOrder tree = preOrderHelper(tree, nil)

Note that the run time of preOrderHelper(tree, lst) depends only on tree. Let r(t) denote the run time of preOrderHelper on tree t. Then r(Empty) = O(1) and r(Node(x, left, right)) = O(1) + r(left) + r(right), so clearly r(t) is linear in the size of t.

What is the derivation of this technique? Is there a more principled way of deriving it? In general, when you're turning a data structure into a list, you want to foldr onto an empty list. I don't know enough ML to say what the equivalent of typeclasses is, but in Haskell, we would approach the situation as follows:

data Tree a = Empty | Node a (Tree a) (Tree a)

instance Foldable Tree where
   foldr f acc t = foldrF t acc  where
      foldrF Empty acc = acc
      foldrF (Node x left right) acc = f x (foldrF left (foldrF right acc))

To convert a Tree a to a [a], we would call Data.Foldable.toList, which is defined in Data.Foldable as

toList :: Foldable f => f a -> [a]
toList = foldr (:) []

Unfolding this definition gives us the equivalent of the ML definition above.

As you can see, your technique is actually a special case of a very principled way to turn data structures into lists.

In fact, in modern Haskell, we can do this totally automatically.

{-# LANGUAGE DeriveFoldable #-}

data Tree a = Empty | Node a (Tree a) (Tree a) deriving Foldable

will give us the equivalent(*) of the above Foldable implementation automatically, and we can then immediately use toList. I don't know what the equivalent is in ML, but I'm sure there's something analogous.
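For readers coming from the question's SML setting: since SML has no typeclasses, the same fold can be written as a plain function. The following is a hypothetical sketch (treeFoldr and toList are names chosen here, not library functions), translating the Haskell Foldable instance above directly:

```
(* the datatype from the question *)
datatype 'a btree = Empty | Node of 'a * 'a btree * 'a btree

(* direct translation of foldrF: fold the right subtree into the
   accumulator first, then the left, then apply f to the node value *)
fun treeFoldr f acc Empty = acc
  | treeFoldr f acc (Node (x, left, right)) =
      f (x, treeFoldr f (treeFoldr f acc right) left)

(* toList = foldr (:) [] becomes: *)
fun toList t = treeFoldr (op ::) nil t
```

Unfolding toList by hand recovers exactly the linear-time preOrderHelper above.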

The difference between ML and Haskell is that Haskell is lazy. Haskell's laziness means that the evaluation of preOrder actually walks the tree in pre-order. This is one of the reasons I prefer laziness: it permits very fine-grained control over the order of evaluation without resorting to non-functional techniques.


(*) (up to argument order, which does not matter in lazy Haskell.)

Mark Saving
  • So foldr on a tree gives me the pre-order representation in linear time? Can you use foldr for in-order and post-order traversals? – Daniel Näslund Jul 09 '21 at 06:35
  • @DanielNäslund to tie this up back to your original "simple" `@`-based code, `foldr (:) [] t` is equivalent to `fold (map (\ x -> [x] ) t)` where `fold` appends together all the constituent lists. it knows to append them because when seen as monoid (which is what `fold` does), lists are combined by the `++` operator (which is `@` in ML). so it really _is_ the same (well, semantically. operationally, is a separate consideration). – Will Ness Jul 09 '21 at 14:41
  • @DanielNäslund "Can you use `foldr` for in-order and post-order traversals?" that's a really interesting question. [this answer](https://stackoverflow.com/a/39180984/849891) says it's impossible (not very easily, at least). the order is baked in, in the `foldr`, and with the automatic deriving it is determined by the data type declaration. for the in-order traversal we would need `data Tree a = Empty | Node (Tree a) a (Tree a)` and for the post-order, `data Tree a = Empty | Node (Tree a) (Tree a) a`. – Will Ness Jul 09 '21 at 15:16
  • that answer hints (and links) at some possibility to do that anyway. another possible way is in [this answer](https://stackoverflow.com/a/55672941/849891). it defines foldMap, but any foldMap gives rise to a foldr, and vice versa. – Will Ness Jul 09 '21 at 15:28
  • Thanks @WillNess and Mark Saving. Monoids and laziness are two concepts I haven't had to deal with when writing ML code. I'll read up on your explanations and links and hopefully in due time I'll have enough context to fully understand your arguments. I'll wait with accepting the answer for another day or so, in the hope that someone will show up and give an explanation using just Standard ML. – Daniel Näslund Jul 09 '21 at 20:18
  • @DanielNäslund You would want to define a `newtype` and provide a different `Foldable` implementation in order to do a post- or in-order traversal. So this technique does generalise. But it's not possible to write in- or post-order traversals in terms of the pre-order implementation of foldr, since foldr behaves the same on two structures which have the same `toList` value. – Mark Saving Jul 09 '21 at 20:49

What you show is not what I've usually seen referred to as a difference list.

That would be, in pseudocode,

-- xs is a prefix of an eventual list xs @ ys,
-- a difference between the eventual list and its suffix ys:
dl xs = (ys => xs @ ys)

and then

pre Empty = (ys => ys)  -- Empty contributes an empty prefix
pre (Node(x, left, right)) = (ys =>
    --  [x] @ pre left @ pre right @ ys  -- this pre returns lists
    (dl [x] . pre left . pre right)  ys) -- this pre returns diff-lists
                        -- Node contributes an [x], then goes 
                        -- prefix from `left`, then from `right`

so that

preOrder tree = pre tree []

where . is the functional composition operator,

(f . g) = (x => f (g x))
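In Standard ML, where `o` is the built-in composition operator, this pseudocode might be rendered as follows (a sketch; `dl` and the `'a dlist` type abbreviation are names chosen here):

```
datatype 'a btree = Empty | Node of 'a * 'a btree * 'a btree

(* a difference list maps the eventual suffix to the full list *)
type 'a dlist = 'a list -> 'a list

fun dl xs = fn ys => xs @ ys            (* lift a plain list to a dlist *)

fun pre Empty = (fn ys => ys)           (* Empty contributes an empty prefix *)
  | pre (Node (x, left, right)) =
      dl [x] o pre left o pre right     (* o is SML's composition operator *)

fun preOrder tree = pre tree nil        (* apply to the empty suffix *)
```

Applied to the example tree t from the question, preOrder yields [5,3,1,4,9].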

Of course since dl [x] = (ys => [x] @ ys) = (ys => x::ys) this is equivalent to what you show, in the form of

--pre Empty = (ys => ys)  -- Empty's resulting prefix is empty
pre'  Empty    ys =  ys  

--pre (Node(x, left, right)) = (ys =>
pre'  (Node(x, left, right))    ys = 
    --     [x] @ pre  left @ pre  right @ ys
    -- (dl [x] . pre  left . pre  right)  ys
            x::( pre' left ( pre' right   ys))

-- preOrder tree = pre' tree []

Operationally, this will traverse the tree right-to-left in an eager language, and left-to-right in a lazy one.

Conceptually, seen left-to-right, the resulting list has [x] and then the result of traversing left and then the result of traversing right, no matter what was the tree traversal order.

These difference lists are just partially applied @ operators, and appending is just functional composition:

   dl (xs @ ys)     ==  (dl xs . dl ys)
 -- or:
   dl (xs @ ys) zs  ==  (dl xs . dl ys)  zs
                    ==   dl xs ( dl ys   zs)
                    ==      xs @   (ys @ zs)

the prefix xs @ ys is the prefix xs, followed by the prefix ys, followed by whatever the eventual suffix zs will be.

Thus appending these difference lists is an O(1) operation, the creation of a new lambda function which is a composition of the arguments:

append dl1 dl2 = (zs =>  dl1 ( dl2  zs))
               = (zs => (dl1 . dl2) zs )
               =        (dl1 . dl2)

Now we can easily see how to code the in-order or post-order traversals, as

in_ Empty = (ys => ys)
in_  (Node(x, left, right)) = (ys =>
    --  in_ left @    [x] @ in_ right @ ys
       (in_ left . dl [x] . in_ right)  ys)

post Empty = (ys => ys)
post  (Node(x, left, right)) = (ys =>
    --  post left @ post right @    [x] @ ys
       (post left . post right . dl [x])  ys)
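The same two traversals in Standard ML, as a self-contained sketch (again with `dl` as defined above; `inord` and `postord` are names chosen here to avoid clashing with SML keywords):

```
datatype 'a btree = Empty | Node of 'a * 'a btree * 'a btree

fun dl xs = fn ys => xs @ ys               (* list to difference list *)

fun inord Empty = (fn ys => ys)
  | inord (Node (x, left, right)) =
      inord left o dl [x] o inord right    (* left, then node, then right *)

fun postord Empty = (fn ys => ys)
  | postord (Node (x, left, right)) =
      postord left o postord right o dl [x]  (* left, right, then node *)

fun inOrder tree = inord tree nil
fun postOrder tree = postord tree nil
```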

Focusing on just lists [x] and their appending @ lets us treat this uniformly -- no need to concern ourselves with :: and its arguments which have different types.

The types of both arguments of @ are the same, just as they are for + with integers and indeed . with functions. Such types paired with such operations are known as monoids, under the condition that the appending operation is associative, (a+b)+c == a+(b+c), and there is an "empty" element, e @ s == s @ e == s. This just means that the combination operation is "structural" in some way. This works with apples and oranges, but atomic nuclei -- not so much.

Will Ness