6

Haskell caches results of pure function calls, one of the many reasons for the separation between pure and impure behavior. Yet this function, which should run in O(n) where n is 50 below, runs really slowly:

lucas 1 = 1
lucas 2 = 3
lucas n = lucas (n-1) + lucas (n-2)
map (lucas) [1..50]

The first thirty terms or so together compute in under a second, then 31 takes half a second or so, 32 takes a full second, 33 takes a few seconds, 34 takes 6 seconds, 35 takes 11 seconds, 36 takes 17 seconds...

Why is this function so slow? If the GHC interactive runtime is caching results, then each call to lucas should only involve summing of the two previous cached terms. One addition operation to find term 3, one additional addition to find term 4, one additional addition to find term 5, so term 50 is reached with only 48 additions total considering caching. The function shouldn't take anywhere near a second to find at least the first few thousand terms, why is the performance so terrible?

I've been waiting for more than half an hour now for lucas 500 to compute.

Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
TheEnvironmentalist
  • 2,694
  • 2
  • 19
  • 46
  • 2
    there is no caching - at least not in the way you expect - this will behave the same as in any other language – Random Dev Jan 25 '16 at 05:24
  • 3
    The behavior is not *O(n^2)*, but *O(2^n)*. – Willem Van Onsem Jan 25 '16 at 05:49
  • 1
    @WillemVanOnsem hahahaha, I see I was a bit too optimistic – TheEnvironmentalist Jan 25 '16 at 05:51
  • anyway both versions I mentioned should get you the first 500 numbers in a few seconds (basically the time it takes to print them on screen) ;) – Random Dev Jan 25 '16 at 06:02
  • 3
    As an aside, you say "Haskell caches results of pure function calls" with confidence. Why? Is there a source out there somewhere that we can correct? So very many people seem to have this misconception somehow... – Daniel Wagner Jan 25 '16 at 08:13
  • I believe it may have been written in Learn You a Haskell – TheEnvironmentalist Jan 25 '16 at 08:15
  • @TheEnvironmentalist Where? – Daniel Wagner Jan 25 '16 at 08:35
  • You probably misinterpreted what *sharing* is about. If you have `let x = some_function a b c in [x,x,x]` then the `x` value is computed only once. Lazyness means that `x` is computed only when something tries to "look at" one of the elements of that list. Once this is done the value is computed and all 3 elements will reference it. A lazy language without sharing would compute the value only when the element is inspected, but accessing two elements would cause two function calls to compute the values. – Bakuriu Jan 25 '16 at 09:03

1 Answers1

7

The reason your version is very slow is that there is no memoization of the lucas (n-1) and lucas (n-2) parts - so both values will be recalculated (recursively) over and over again - which is painfully slow.

The solution is to keep the calculated values somewhere:

using list-lazyness

Here is a simple version that will do the same as your code-snippet but should be way faster - it will keep the already calculated parts in the list itself:

lucasNumbers :: [Integer]
lucasNumbers = 1:3:zipWith (+) lucasNumbers (tail lucasNumbers)

first50 :: [Integer]
first50 = take 50 lucasNumbers

the reason why this is faster is that now the lazyness of the list will help you memoize the different parts

You can learn much about this if you look for Fibonacci sequences in Haskell (it's really just the same as yours ;) )

using unfoldr

another (maybe less magic seeming) way to do this is using Data.List.unfoldr - here the already calculated parts (or those that matter - the last and second-to-last elements) will be in the state you pass around for the unfold operation:

lucasNumbers :: [Integer]
lucasNumbers = unfoldr (\ (n,n') -> Just (n, (n',n+n'))) (1,3)

to your comment/question:

assuming you are talking about x(n) = x(n-1)^2-2 then you can do it like this:

lucasLehmer :: [Integer]
lucasLehmer = 4 : map (\ x -> x^2-2) lucasLehmer

which will yield something like this:

λ> take 5 lucasLehmer
[4,14,194,37634,1416317954]

maybe you should try the unfoldr version yourself

Random Dev
  • 51,810
  • 9
  • 92
  • 119
  • How would you do this with list operations that only apply to the previous term, like the Lucas-Lehmer numbers, where the first is 4 and each subsequent term is equal to two less than the previous term squared? – TheEnvironmentalist Jan 25 '16 at 05:34
  • 1
    It's even easier in that case. You can do it with `unfoldr`, but `iterate` is written for exactly this purpose. – amalloy Jan 25 '16 at 07:53
  • @amalloy yes of course (for the `lucasLehmer` sequence) that's true ... this and maybe hundred other ways to do it ;) - I did not want to add yet another way (which will not work for the original question - or not easily - anyway) – Random Dev Jan 25 '16 at 08:09