3

I have some training in the lisp family of languages and I am now learning a bit of Haskell for my own good. In lisp, functional style is ok but there are a few cases where imperative style is necessary to get decent performance, e.g.

  • append

    Append is slow since it must copy its first argument (sometimes x100 as slow as versions that succeed to get rid of it). A remedy is to move the last pointer of the first list to the beginning of the second list, instead of appending. Of course this is a destructive operation.

  • sort

    Functional versions of quicksort create many intermediate lists, which somehow defeats the purpose of the algorithm, which is to be as quick as possible. AFAIR, in common lisp, sort is the only destructive function that does not have a functional version.

  • length

    This is a costly operation in lisp since one must go down the whole list to get its length. This needs not be so, afaik clojure computes length of list in logarithmic time. The solution is often to compute the length on the fly in an imperative loop.

My question is, how do we deal with these problems in haskell ?

stackman
  • 360
  • 1
  • 9
  • 13
    The problems you describe all seem to stem from a poor choice of data structure (cons lists where you should use queues/deques/etc.; data structure which doesn't maintain a length field when you need the length) or algorithm (quicksort instead of a sort that works better on persistent lists) rather than from functional style. –  Aug 04 '13 at 17:51

2 Answers2

11

As delnan said, these problems are with using the wrong data structure, such as a linked list when you want a vector.

Your particular operations

  • append: cons is O(1), so I suspect you are referring to the act of allocating a cons cell, pattern matching on the cell, and eventual GC of the cell. This is rather unavoidable unless you optimize away the list, which does happen in certain situations.

  • sort: I suggest you look at the ST monad, which allows you to use algorithms that require mutation inside a pure context. See the vsort implementation for an example.

  • length: Sure, the only way to get the length of a linked list is to traverse the list. If you need the length often then consider using a Set, Map, or Vector - all of which record a total size that can be accessed in O(1).

In General

  • Look into ST for mutability.
  • Use the right data structure for the right problem. Learn what structures Hackage has to offer (vectors, arrays, hashmaps, hashtables, bloomfilters, and more).
Community
  • 1
  • 1
Thomas M. DuBuisson
  • 64,245
  • 7
  • 109
  • 166
  • To clarify, as an example of imperative optimisation, mappend is the (reverse) bind operator in the list monad. In lisp you can switch to its destructive version, mapcan, and get a x3 boost in execution speed. This is not related to lists being the wrong data type. Append is slow in lisp (but maybe not that slow in haskell? I get decent performance from the haskell list monad, a bit surprisingly). – stackman Aug 04 '13 at 18:45
  • 6
    I'm still not clear on what you're hoping for. If you want destructive updates you have a way to get it (and structures that are mutable, tying in with the notion of using the right data structure). If you just saying 'I suspect this operation is slow because it's slow in Lisp and would pre-emptively like an alternative' then I think you are being too hasty and should investigate the actual performance first. – Thomas M. DuBuisson Aug 04 '13 at 20:43
3

append isn't a universal thing. Appending on to a double-ended queue is different from appending on to a cons list. Appending destructively is different from copy-on-append. Depending on your criteria, you may optimize for slowness or thread safety or simplicity and your definition of "problem" will change.

As delnan and Thomas DuBuisson mention Haskell has all of these options if you pick the right data type.

So if you have a specific algorithm you'd like to optimize, there's likely to be either a method to translate it to a fast non-destructive version, an option to simulate destructions at usually a multiplicative log(n) slowdown, or an option to run referentially transparent destructions at an additive constant slowdown.

For good examples of trouble with this translation look at algorithms for persistent union-find or the Functional Graph Library.

J. Abrahamson
  • 72,246
  • 9
  • 135
  • 180