
After writing this article I decided to put my money where my mouth is and started to convert a previous project of mine to use recursion-schemes.

The data structure in question is a lazy kdtree. Please have a look at the implementations with explicit and implicit recursion.

This is mostly a straightforward conversion along the lines of:

    data KDTree v a = Node a (KDTree v a) (KDTree v a) | Leaf v a

to

    data KDTreeF v a f = NodeF a f f | LeafF v a
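
Roughly, the `KDTreeF` version is tied together with `Fix` like this (a minimal sketch; the `size` fold is purely illustrative, not code from the project):

    {-# LANGUAGE DeriveFunctor #-}

    -- Fix and cata come from recursion-schemes
    -- (newer versions take Fix from the data-fix package)
    import Data.Functor.Foldable (Fix, cata)

    data KDTreeF v a f = NodeF a f f | LeafF v a
      deriving (Functor)

    -- the recursive tree is recovered by tying the knot with Fix
    type KDTree' v a = Fix (KDTreeF v a)

    -- an illustrative fold: count the leaves
    size :: KDTree' v a -> Int
    size = cata alg
      where
        alg (LeafF _ _)   = 1
        alg (NodeF _ l r) = l + r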

Now after benchmarking the whole shebang I find that the KDTreeF version is about two times slower than the normal version (find the whole run here).

Is it just the additional Fix wrapper that slows me down here? And is there anything I could do against this?

Caveats:

  • At the moment this is specialized to `V3 Double`.
  • This is a catamorphism applied after an anamorphism; a hylomorphism isn't suitable for kd-trees.
  • I use `cata (fmap foo algebra)` several times (see the sketch after this list). Is this good practice?
  • I use Edward Kmett's recursion-schemes package.
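
To make the third caveat concrete: the `fmap` in `cata (fmap foo algebra)` is the one for functions, so `fmap foo algebra` is just `foo . algebra`, i.e. `foo` post-processes the intermediate result at every level of the fold. A hypothetical sketch of that pattern (`nearestK`, `dist` and the choice of `take k` are illustrative, not the project's actual code):

    {-# LANGUAGE DeriveFunctor #-}

    import Data.Functor.Foldable (Fix, cata)
    import Data.List (sortBy)
    import Data.Ord (comparing)

    data KDTreeF v a f = NodeF a f f | LeafF v a
      deriving (Functor)

    -- hypothetical k-nearest-neighbour fold: the algebra collects candidates,
    -- and `fmap (take k) alg` (= `take k . alg`) prunes the candidate list
    -- at every node instead of only once at the very end
    nearestK :: (v -> Double) -> Int -> Fix (KDTreeF v a) -> [(Double, a)]
    nearestK dist k = cata (fmap (take k) alg)
      where
        alg (LeafF v a)   = [(dist v, a)]
        alg (NodeF _ l r) = sortBy (comparing fst) (l ++ r)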

Edit 1:

Is this related? https://ghc.haskell.org/trac/ghc/wiki/NewtypeWrappers Is `newtype Fix f = Fix (f (Fix f))` not "free"?

Edit 2:

Just did another bunch of benchmarks. This time I tested tree construction and deconstruction. Benchmark here: https://dl.dropboxusercontent.com/u/2359191/2014-05-15-kdtree-bench-03.html

The Core output indicates that the intermediate data structures are not removed completely, and it is not surprising that the linear searches dominate now; still, the KDTreeFs come out slightly faster than the KDTrees this time. It doesn't matter much, though.

  • I would benchmark that against `data KDTree v a = Node a (Identity (KDTree v a)) (Identity (KDTree v a)) | Leaf v a`, just to see if the extra indirection is the culprit. Make sure `Identity` is not a `newtype`. – chi May 14 '14 at 20:24
  • https://dl.dropboxusercontent.com/u/2359191/2014-05-14-kdtree-bench-02.html now the *_nn* tests are wrapped in `data Identity a = MkId { unId :: a}` as you suggested. I'd say it is slower now ... but not as slow as the morphism version. – fho May 14 '14 at 20:46
  • Whoever voted to close this question: care to elaborate? – fho May 19 '14 at 09:01

2 Answers


I have just implemented the Thing + ThingF + Base instance variant of the tree. And guess what ... this one is amazingly fast.
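
The idea is to keep the plain recursive `KDTree` type and only use `KDTreeF` as its base functor, so no `Fix` wrapper ever shows up in the values. Roughly, as a sketch (the actual code is in the repo linked in the comments; note that recursion-schemes versions before 5 call the two classes `Foldable`/`Unfoldable` instead of `Recursive`/`Corecursive`):

    {-# LANGUAGE DeriveFunctor, TypeFamilies #-}

    import Data.Functor.Foldable (Base, Corecursive (..), Recursive (..))

    -- the plain recursive tree, no Fix anywhere
    data KDTree v a = Node a (KDTree v a) (KDTree v a) | Leaf v a

    -- the base functor, only ever seen by the recursion schemes
    data KDTreeF v a f = NodeF a f f | LeafF v a
      deriving (Functor)

    type instance Base (KDTree v a) = KDTreeF v a

    instance Recursive (KDTree v a) where
      project (Node a l r) = NodeF a l r
      project (Leaf v a)   = LeafF v a

    instance Corecursive (KDTree v a) where
      embed (NodeF a l r) = Node a l r
      embed (LeafF v a)   = Leaf v a

    -- cata, ana, etc. now work directly on KDTree values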

I was under the impression that this one would be the slowest of all variants. I really should have read my own post ... the line where I write:

there is no trace of the TreeF structure to be found

Let the numbers speak for themselves; `kdtreeu` is the new variant. The results are not always as clear-cut as in these cases, but in most cases the new variant is at least as fast as the explicit recursion (`kdtree` in the benchmark).

Benchmark results

  • I have now accelerated the kdtreeu_nn5 case too; it turns out that the `cata (fmap foo algebra)` pattern is not useful in this case. – fho May 19 '14 at 09:03
  • When you finalize this last variant, will you push it to github or make it available somehow? – chi May 19 '14 at 09:07
  • This is still WIP, but the concept should be clear here: https://github.com/fhaust/kdtree/blob/master/src/Data/KDTree.hs#L52-61 – fho May 19 '14 at 09:39
  • Aaand ... I just remembered that I merged `KDTreeU` into `KDTree`. If you want comparative benchmarks you'll have to benchmark a current vs one of the old versions. And the corresponding lines are: https://github.com/fhaust/kdtree/blob/master/src/Data/KDTree.hs#L137-L147 – fho May 19 '14 at 09:46

I wasn't using the recursion-schemes package, but rather my own "hand-rolled" cata, ana and Fix/unFix, to generate (lists of) and evaluate programs in a small language, in the hope of finding one that matched a list of (input, output) pairs.
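
By "hand-rolled" I mean nothing fancier than roughly the textbook definitions:

    newtype Fix f = Fix { unFix :: f (Fix f) }

    -- fold: tear the structure down one layer at a time
    cata :: Functor f => (f a -> a) -> Fix f -> a
    cata alg = alg . fmap (cata alg) . unFix

    -- unfold: build the structure up one layer at a time
    ana :: Functor f => (a -> f a) -> a -> Fix f
    ana coalg = Fix . fmap (ana coalg) . coalg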

In my experience, cata optimized better than direct recursion and gave a speed boost. Also in my experience, ana prevented stack overflow errors that my naive generator was causing, but that may have centered around generation of the final list.

So my answer would be: no, they aren't always slower. I don't see any obvious problems, so they may simply be slower in your case. It's also possible that recursion-schemes itself is just not optimized for speed.

  • Just did a last test run against a hand-rolled cata. It's still not as fast as the implicit one or the one from `recursion-schemes`. So I think I'll just give up at this point and concentrate on different things. Thanks for your answer :) – fho May 17 '14 at 07:08