Left associative binary tree fold

Question

With the normal right associative tree fold I can define pre-/in-/post-order just by rearranging the concatenation in the supplied function f:

const treeFoldr = f => acc => function go(t) {
  if (t[TAG] === "Leaf") return acc;
  else return f(t.v) (go(t.l)) (go(t.r));
};

const TAG = Symbol.toStringTag;

const N = (l, v, r) => ({[TAG]: "Node", l, v, r});
const L = {[TAG]: "Leaf"};

const foo = N(
  N(
    N(L, 4, L),
    1,
    N(L, 5, L)
  ),
  0,
  N(
    L,
    2,
    N(L, 3, L)
  )
);

const r1 = treeFoldr(
  x => acc1 => acc2 => {return [x].concat(acc1).concat(acc2)}) ([]) (foo); // pre-order

const r2 = treeFoldr(
  x => acc1 => acc2 => {return acc1.concat([x]).concat(acc2)}) ([]) (foo); // in-order
  
const r3 = treeFoldr(
  x => acc1 => acc2 => {return acc1.concat(acc2).concat([x])}) ([]) (foo); // post-order

console.log(r2); // in-order [4,1,5,0,2,3]

Now I guess the same should be possible with a left associative fold, where f's result is passed to the next recursive go invocation. But all I could come up with is this hard-coded pre-order fold:

treeFoldl = f => acc => function go(t) {
  if (t[TAG] === "Leaf") return acc;
  
  else {
    acc = f(acc) (t.v);
    acc = go(t.l);
    return go(t.r);
  }
};

In order to get the desired design I would somehow have to incorporate two accumulators (because of f's signature) and the recursive calls of the left/right node, but I don't have a clue how.

It's probably quite easy, but I just don't see the wood for the trees..

[EDIT]

As requested in the comments here is a pure version of treeFoldl:

const treeFoldl = f => init => t => function go(acc, u) {
  if (u[TAG] === "Leaf") return acc;
  
  else {
    const acc_ = go(f(acc) (u.v), u.l);
    return go(acc_,   u.r);
  }
} (init, t);

Can you rewrite your `treeFoldl` to be pure, please? That assignment to `acc` makes it really confusing — Bergi, Nov 25 '21 at 22:16
by "desired design" i suppose you mean a folding function that allows for the same interface as demonstrated in your `treeFoldr`, but instead has evolves a flat, tail-recursive process? i will suggest it is fundamentally impossible as the sequencing of the `f` calls must necessarily have an order determined by the fold implementation. in your attempt, you notice a hard-coded pre-order, but rearranging the operations you could achieve in-order and post-order as well. — Mulan, Nov 26 '21 at 01:01
this supports the reason why i would recommend not to think of this as `treeFoldR` and `treeFoldL` and instead think of the various folds based on their name, `inorder`, `preorder`, `postorder`. your proposed `treeFoldL` has two calls to `go` and so it still evolves a recursive (nonlinear) process. `inorder`, `preorder`, and `postorder` can all be implemented with tail recursion using cps. — Mulan, Nov 26 '21 at 01:05
[this related Q&A](https://stackoverflow.com/a/49473130/633183) should be familiar to you, and [the other answer](https://stackoverflow.com/a/49458106/633183) contains implementations of each traversal — Mulan, Nov 26 '21 at 01:15
@Bergi I added a pure `treeFoldl` but now it's harder to see how to construct the in-/post-order version. At least at the operational level both `treeFoldl`'s should be equivalent, because the impure one relies on reassignment, not mutation, which is basically the same as providing a fresh name binding in a new scope, isn't it? — , Nov 26 '21 at 10:09
@Mulan My q amounts to whether I can process both accumulators before the base cases are reached and let `f` decide, how to concat them. The underlying premise is that left/right folds are equivalent, but maybe in a strict sense these aren't folds anymore? I'll try the CPS transformation of the right fold, which naturally should yield a left one. — , Nov 26 '21 at 10:26
@Mulan I made the CPS tranform (essentially `go(t.l) (acc1 => go(t.r) (acc2 => k(f(acc1) (acc2) (t.v))))` but it's still a rigt fold. I guess it's not possible to implement `treeFoldl` in terms of `treeFoldr`. — , Nov 26 '21 at 13:09
I just want to thank all the participants for a very interesting and enlightening Q + A. — Scott Sauyet, Nov 29 '21 at 14:18

Bergi · Answer 1 · 2021-11-26T20:00:48.213

4

With the normal right associative tree fold I can define pre-/in-/post-order just by rearranging the concatenation in the supplied function f

What you have defined there is not a foldR function. It is actually a catamorphism for your tree structure (see also this answer), and has exactly this advantage that you can implement anything with it. You can implement fmap, constructing a new tree of the same shape. Or you can implement the linear left fold and right fold:

// cata :: ((b, a, b) -> b), () -> b) -> Tree a -> b
const cata = (onNode, onLeaf) => tree => {
  if (t[TAG] === "Leaf")
    return onLeaf();
  else
    return onNode(
      cata(onNode, onLeaf)(t.l),
      t.v,
      cata(onNode, onLeaf)(t.r)
    );
};

// foldR :: ((a, b) -> b) -> b -> Tree a -> b
const foldR = f => init => t => cata(
  (l, v, r) => acc => l(f(v, r(acc)),
  () => acc => acc
)(t)(init);

// foldL :: ((b, a) -> b) -> b -> Tree a -> b
const foldL = f => init => t => cata(
  (l, v, r) => acc => r(f(l(acc), v),
  () => acc => acc
)(t)(init);

Without the catamorphism, the implementations should look like this:

// foldR :: ((a, b) -> b) -> b -> Tree a -> b
const foldR = f => init => t => {
  if (t[TAG] === "Leaf")
    return init;
  else {
    const acc1 = foldR(f)(init)(t.r);
    const acc2 = f(t.v, acc1);
    return foldR(f)(acc2)(t.l);
  }
};

// foldL :: ((b, a) -> b) -> b -> Tree a -> b
const foldL = f => init => t => {
  if (t[TAG] === "Leaf")
    return init;
  else {
    const acc1 = foldL(f)(init)(t.l);
    const acc2 = f(acc1, t.v);
    return foldL(f)(acc2)(t.r);
  }
};

Neither of these have a pre-order or post-order variant, the information about the tree shape is lost to the reducing function. A linear fold always is in-order, just with different associativity between the left and right variants.

edited Nov 26 '21 at 20:00

answered Nov 26 '21 at 19:38

Bergi

630,263
148
957
1,375

nice demonstration of catamorphism. while writing out the _meta-circular evaluator_ section in my answer, i also imagined an inversion of control where `l` and `r` were functions, permitting the caller to sequence them appropriately in the callback `f`. this comes with a change of the proposed interface in the question but i think this trade is very fair. excellent post, Bergi :D – Mulan Nov 26 '21 at 20:10
1

If you want to control the order of evaluation (essentially make it lazy), I'd probably change the `onNode` callback to `(() -> b, a, () -> b) -> b` – Bergi Nov 26 '21 at 20:14
I got used to the fact that for lists fold/cata coincide. Is it a consequence of the types or is there a law that tree folds have to be in-order? – Nov 27 '21 at 10:10
Is your second `foldR` (w/o cata) a right fold in the sense of `[].reduceRight`? I think `f` should get access to its first argument prior to the recursive step. For imperative arrays, however, it doesn't make much of a difference I guess. – Nov 27 '21 at 10:18
Also `() => acc => acc` is just `() => id`. I dunno why you use a thunk there, otherwise the leaf handler would be just `id`. – Nov 27 '21 at 12:08
"*Is there a law that tree folds have to be in-order?*" - not that I know of. I guess you could also implement pre-order, post-order, reverse-pre-order, and reverse-post-order folds. If your tree was unordered, a caller couldn't tell anyway - you might as well do (pseudo-)random-order. But most trees do have an order, and it's a reasonable expectation that a fold runs in tree order. So yes, `foldR` (both of them) are equivalent to array `reduceRight`. – Bergi Nov 27 '21 at 13:38
"*the leaf handler would be just `id`*" - yes, I just didn't bother to extract that out into a separate declaration, and JS doesn't have a builtin `id` function. I used a thunk (a function with a zero-tuple argument) mainly to emphasise the similarity of the type to the `onNode` handler type (a function with a three-tuple argument), if your `Leaf` elements did contain a value that would have been passed to the `onLeaf` handler. – Bergi Nov 27 '21 at 13:41

Mulan · Answer 2 · 2022-05-19T14:00:08.943

wishful thinking

Given the various user-supplied folding functions -

traversal	user-supplied `f`
preorder	`l => v => r => [v].concat(l).concat(r)`
inorder	`l => v => r => l.concat([v]).concat(r)`
postorder	`l => v => r => l.concat(r).concat([v])`

If we want our wishes to come true, we can infer that l and r must already be arrays, otherwise Array.prototype.concat is not available.

Given your tree -

It is the nature of these folding functions, f, to flatten a compound node value into a single array value. After the first node, regardless of the f traversal chosen, the output will be [0], which is now the accumulator for the next nodes -

forked accumulator

               f([], 0, [])
             /              \
            /                \
    f([0], 1, [0])     f([0], 2, [0])
    /            \    /              \
   ...           ... ...             ...

Here we should already be able to see an issue. The accumulator has been flattened and where "left" and "right" are in relation to the zero is lost information. Worse the accumulator has been forked and we will need some merging solution to reconcile this split. How can the merging solution work without enforcing some kind of ordering to ensure its intended result?

sequenced folds

If you say, "no, we will not fork the accumulator," and instead sequence the calls to f, do we fold the left branch first? Do we fold the right branch first?

              f(..., 0, ...)
             /              \
            /                \
    f(..., 1,          f(..., 2, ...))
    /                                \
   ?               ???                ?

Even if we could manage to make these valid calls to f, selecting the left or right branch first is the ordering.

meta-circular evaluator

So we've hit a dead-end twice now. Okay, let's go back to the beginning and suppose we could change the user-supplied traversals -

traversal	user-supplied `f`
preorder	`l => v => r => [v, l, r]`
inorder	`l => v => r => [l, v, r]`
postorder	`l => v => r => [l, r, v]`

In this arrangement, l and r are no longer constrained to arrays, and instead they could be anything. They could objects, identifiers, or other type of sentinel that treeFoldL recognizes and uses to guide its growing computational process. As you know with enough engineering, we can make it do whatever we want -

For example, consider this inorder traversal -

[ L, 0, R ]
[ ...[L, 1, R], 0, R ]
[ L, 1, R, 0, R ]
[ ...[L, 4, R], 1, R, 0, R ]
[ L, 4, R, 1, R, 0, R ]
[ ...[], 4, R, 1, R, 0, R ]
[ 4, R, 1, R, 0, R ]
[ 4, ...[], 1, R, 0, R ]
[ 4, 1, R, 0, R ]
[ 4, 1, ...[L, 5, R], 0, R ]
[ 4, 1, L, 5, R, 0, R ]
[ 4, 1, ...[], 5, R, 0, R ]
[ 4, 1, 5, ...[], 0, R ]
[ 4, 1, 5, 0, R ]
[ 4, 1, 5, 0, ...[L, 2, R] ]
[ 4, 1, 5, 0, L, 2, R ]
[ 4, 1, 5, 0, ...[], 2, R ]
[ 4, 1, 5, 0, 2, R ]
[ 4, 1, 5, 0, 2, ...[L, 3, R] ]
[ 4, 1, 5, 0, 2, L, 3, R ]
[ 4, 1, 5, 0, 2, ...[], 3, R ]
[ 4, 1, 5, 0, 2, 3, R ]
[ 4, 1, 5, 0, 2, 3, ...[] ]
[ 4, 1, 5, 0, 2, 3 ]

To keep this generic, this requires that an ordering function is supplied separately from the folding function, f, which is now a familiar 2-arity interface -

const treeFoldl = ordering => f => init => t =>
  // ...

const inorder =
  treeFoldL (l => v => r => [l, v, r])

const concat = a => b =>
  a.concat(b)

const toArray =
  inorder (concat) ([])

toArray (myTree) // => [...]

const toString =
  inorder (concat) ("")

toString (myTree) // => "..."

back to planet earth

So you could go through the trouble and write treeFoldl in such a way however it overlooks the most obvious implementation, imo. From where I see it, there is no treeFoldl and treeFoldr, instead there is only preorder, inorder, postorder, levelorder, whateverorder. For purposes of associativity, there is foldl and foldr -

Here's a generic foldl that accepts any iterable, it -

const foldl = f => init => it => {
  let acc = init
  for (const v of it)
    acc = f(acc, v)
  return acc
}

And some possible traversals for your binary tree, here written as generators but you can use whatever implementation you like, stack-safe or otherwise -

function *preorder(t) {
  if (t?.[TAG] == "Leaf") return
  yield t.v
  yield *preorder(t.l)
  yield *preorder(t.r)
}

function *inorder(t) {
  if (t?.[TAG] == "Leaf") return
  yield *inorder(t.l)
  yield t.v
  yield *inorder(t.r)
}

function *postorder(t) {
  if (t?.[TAG] == "Leaf") return
  yield *postorder(t.l)
  yield *postorder(t.r)
  yield t.v
}

function *levelorder(t) {
  // pop quiz: 
  // how would you write levelorder using your original interface?
  // ...
}

Instead of tangling left/right associativity in your traversals, these things are kept separate and remain a choice of the caller. Using them is straightforward -

// left
foldl (concat) ([]) (preorder(t)) // => [...]
foldl (concat) ([]) (inorder(t))  // => [...]
foldl (concat) ([]) (postorder(t)) // => [...]

// or right
foldr (concat) ([]) (preorder(t)) // => [...]
foldr (concat) ([]) (inorder(t))  // => [...]
foldr (concat) ([]) (postorder(t)) // => [...]

// left
foldl (concat) ("") (preorder(t)) // => "..."
foldl (concat) ("") (inorder(t))  // => "..."
foldl (concat) ("") (postorder(t)) // => "..."

// or right
foldr (concat) ("") (preorder(t)) // => "..."
foldr (concat) ("") (inorder(t))  // => "..."
foldr (concat) ("") (postorder(t)) // => "..."

The advantages of separation of concerns are numerous. foldl and foldr are suitably generic to work with any sequenceable data type. And it is the domain of that data type to specify the ways it which it can be sequenced.

I fancy your out-of-the-box thinking. For ordinary lists a fold can reconstruct them and fold/catamorphism are the same. I took this for granted also for the BST type, but there fold/cata diverge and their is an information loss while folding. — , Nov 27 '21 at 09:31
it's always a fun exploration and discussion. thanks for sharing your work over the years :D — Mulan, Nov 27 '21 at 16:46

Left associative binary tree fold

2 Answers2