8

In Haskell I can write a self-referential sequence in GHCi, like so:

λ> let x = 1:map (+1) x
λ> take 5 x

which produces:

[1,2,3,4,5]

However, my intuition about lazy evaluation says the expansion should proceed like this:

let x = 1:map (+1) x
1:2:map (+1) x
1:2:map (+1) [1, 2] <-- substitution
1:2:2:3:map (+1) x
1:2:2:3:map (+1) [1, 2, 2, 3] <-- substitution
1:2:2:3:2:3:3:4:map (+1) x
...

This is obviously not what's happening. I can see the pattern in the correct answer: we are merely moving down an infinite stream, one element at a time. I recognize the pattern and can apply it in code, but it does not line up with my mental model of lazy evaluation. It feels a bit "magic". Where is my intuition wrong?

Shoe
kmorris511
  • Forgot an `in` in your first example? – Shoe Jun 24 '14 at 00:57
  • @Jefffrey That's valid syntax for a ghci session – Benjamin Kovach Jun 24 '14 at 01:07
  • You treat `x` as if it has a different value based on how much has been evaluated. This is not the case. `x` *always* has the value `[1,2,3...]`. After all, it is a pure value and it cannot depend on time or the runtimes internal state. You can treat this as textual substitution: `x = 1:map (+1) x => x = 1:map (+1) (1:map (+1) x) => 1:map (+1) (1:map (+1) (1:map (+1) x))`, etc. – user2407038 Jun 24 '14 at 01:36
  • also, see https://stackoverflow.com/questions/10257326 – m09 Jun 24 '14 at 09:54
  • @user2407038 that's not taking memoization of thunks into account. – Will Ness Jun 25 '14 at 19:55
  • http://en.m.wikibooks.org/wiki/Haskell/Fix_and_recursion explains a lot. – dfeuer Jun 28 '14 at 17:46

5 Answers

17

Remember to substitute something only with its definition. So whenever you expand x, you should substitute 1 : map (+1) x, not its "current value" (whatever that means).

I'll reproduce Jefffrey's idea, but with due respect for laziness.

x = 1 : map (+1) x

take 5 x
= take 5 (1 : map (+1) x)                                 -- x
= 1 : take 4 (map (+1) x)                                 -- take
= 1 : take 4 (map (+1) (1 : map (+1) x))                  -- x
= 1 : take 4 (2 : map (+1) (map (+1) x))                  -- map and (+)
= 1 : 2 : take 3 (map (+1) (map (+1) x))                  -- take
= 1 : 2 : take 3 (map (+1) (map (+1) (1 : map (+1) x)))   -- x
= 1 : 2 : take 3 (map (+1) (2 : map (+1) (map (+1) x)))   -- map and (+)
= 1 : 2 : take 3 (3 : map (+1) (map (+1) (map (+1) x)))   -- map and (+)
= 1 : 2 : 3 : take 2 (map (+1) (map (+1) (map (+1) x)))   -- take

and so on.

Exercise: finish the evaluation in this style yourself (it's quite informative).

Notice how we are starting to build up a chain of maps as the list grows. If you just print x, you will see the output start to slow down after a while; this is why. There is a more efficient way, left as an exercise (and [1..] is cheating :-).

N.B. this is still a little less lazy than what will actually happen. map (+1) (1 : ...) evaluates to (1+1) : map (+1) ..., and the addition will only happen when the number is actually observed, by either printing it or e.g. comparing it.
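The prefix this reduction produces can be sanity-checked directly in GHC (a minimal sketch; the name `firstFive` is introduced here for illustration):

```haskell
-- The self-referential definition from the question
x :: [Integer]
x = 1 : map (+1) x

-- Forcing the first five elements drives exactly the reduction shown above
firstFive :: [Integer]
firstFive = take 5 x   -- [1,2,3,4,5]
```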

Will Ness identified an error in this post; see the comments and his answer.

luqui
  • your reduction sequence doesn't take memoization into account. laziness = non-strict reduction (what you show) + *memoization*. i.e. you show the reduction for the definition `x = _Y ( (1:) . map (1+)) where _Y g = g (_Y g)` which eliminates sharing; but the actual definition is `x = fix ( (1:).map(1+)) where { fix f = x where x = f x}`. -- Consider this: in your resulting sequence `1:2:3:4:...`, 2 is result of two invocations of (1+) and 3 - of three invocations of (1+), separate of those that produced 2. But with sharing, the increments that went into 2, are reused when producing 3. – Will Ness Jun 24 '14 at 09:59
  • correction, 3 is result of two (1+) invocations, and 4 is result of separate three (1+) invocations, clearly seen on the last line of your reduction sequence. Which is not what happens (hopefully) in GHC or any other sane implementation. – Will Ness Jun 24 '14 at 10:19
  • IOW this answer is incorrect. And `[1..]` is ***not*** cheating - it is exactly equivalent, because `succ == (1+)`. :) Indeed, there's only a slowdown with `_Y ((1:).map(1+))`; with `fix ((1:).map(1+))` (or with the OP's definition) there's no slowdown. – Will Ness Jun 24 '14 at 16:21
  • @WillNess, I think you're right, thanks. I'm trying to think of a good way to present the evaluation taking this into account. – luqui Jun 24 '14 at 18:54
  • I think naming the interim entities helps, as I do in my answer. – Will Ness Jun 24 '14 at 19:13
7

Here is what happens. Laziness is non-strictness + memoization (of thunks). We can show this by naming all the interim data that come into existence as the expression is forced:

λ> let x  = 1  : map (+1) x
   >>> x  = a1 : x1                             -- naming the subexpressions
       a1 = 1
       x1 = map (+1) x 

λ> take 5 x 
==> take 5 (a1:x1)                              -- definition of x
==> a1:take 4 x1                                -- definition of take
          >>> x1 = map (1+) (1:x1)              -- definition of x
                 = (1+) 1 : map (1+) x1         -- definition of map
                 = a2     : x2                  -- naming the subexpressions
              a2 = (1+) 1                        
              x2 = map (1+) x1  
==> a1:take 4 (a2:x2)                           -- definition of x1
==> a1:a2:take 3 x2                             -- definition of take
             >>> x2 = map (1+) (a2:x2)          -- definition of x1
                    = (1+) a2 : map (1+) x2     -- definition of map
                    = a3      : x3              -- naming the subexpressions
                 a3 = (1+) a2                    
                 x3 = map (1+) x2  
==> a1:a2:take 3 (a3:x3)                        -- definition of x2
==> a1:a2:a3:take 2 x3                          -- definition of take
                >>> x3 = map (1+) (a3:x3)       -- definition of x2
.....

The elements in the resulting stream a1:a2:a3:a4:... each refer to its predecessor: a1 = 1; a2 = (1+) a1; a3 = (1+) a2; a4 = (1+) a3; ....

Thus it is equivalent to x = iterate (1+) 1. Without the sharing of data and its reuse through back-references (enabled by the memoization of thunks), it would be equivalent to x = [sum $ replicate n 1 | n <- [1..]], which is a radically less efficient computation (O(n²) instead of O(n)).
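These equivalences are easy to check on finite prefixes (a sketch; the names `sharedView` and `nonSharedView` are introduced here for illustration):

```haskell
x :: [Integer]
x = 1 : map (+1) x

-- With sharing, each element is one increment of its predecessor:
sharedView :: [Integer]
sharedView = iterate (1+) 1

-- Without sharing, element n is in effect recomputed from scratch,
-- as a sum of n ones (O(n²) additions overall):
nonSharedView :: [Integer]
nonSharedView = [sum (replicate n 1) | n <- [1 ..]]
```

All three agree on every finite prefix; only their costs differ.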

We can explicate the sharing vs non-sharing with

fix g = x where x = g x        -- sharing fixpoint
x = fix ((1:) . map (1+))      --  corecursive definition

_Y g = g (_Y g)                -- non-sharing fixpoint
y = _Y ((1:) . map (1+))       --  recursive definition

Trying to print out y at GHCi's prompt shows a marked slowdown as the progression goes along. There's no slowdown when printing out the x stream.

(see also https://stackoverflow.com/a/20978114/849891 for a similar example).

Will Ness
  • This should be the accepted answer; it reveals the truth of lazy evaluation. Thank you! – Keith Feb 24 '18 at 13:26
2

You're mapping +1 over the entire list, so that initial 1 becomes n, where n is the number of times you have lazily recursed, if that makes sense. So instead of the derivation that you're thinking of, it looks more like this:

1:...                            -- [1 ...]
1: map (+1) (1:...)              -- [1, 2 ...]
1: map (+1) (1:map (+1) (1:...)) -- [1, 2, 3 ...]

A 1 gets prepended to a lazily computed list whose elements all get incremented during each step of recursion.

So you can sort of think of the nth step of recursion as taking the list [1, 2, 3, ..., n ...], turning it into the list [2, 3, 4, ..., n+1 ...], and prepending a 1.
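This semantic picture can be sketched as iterated approximation (the names `step` and `approximations` are introduced here, not from the answer):

```haskell
-- One step of the picture described above: increment every element,
-- then prepend a 1
step :: [Integer] -> [Integer]
step xs = 1 : map (+1) xs

-- Starting from the empty list, each step pins down one more element;
-- the limit of these approximations is the infinite list x
approximations :: [[Integer]]
approximations = iterate step []
-- [], [1], [1,2], [1,2,3], ...
```

Each finite approximation agrees with the previous one and adds one more correct element, which is why thinking of "the nth step of recursion" works as a mental model.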

Will Ness
Benjamin Kovach
  • "and prepending a `1`" this is actually a reverse of what happens. lists (as any lazy data) are fleshed out from the top down, by access. This is the foundation of [efficiency of corecursive definitions](https://en.wikipedia.org/wiki/Corecursion#Factorial), translated into iterative computations (not unlike [the TRMC optimization](https://en.wikipedia.org/wiki/Tail_call#Tail_recursion_modulo_cons)). – Will Ness Jun 24 '14 at 10:48
  • @WillNess Makes sense -- thanks! I still think it's instructive to think of it in this way (though perhaps my sentence about prepending initially was incorrect), but I can see the difference. – Benjamin Kovach Jun 24 '14 at 13:27
  • you meant it semantically, as a way to understand/explain the definition. :) – Will Ness Jun 24 '14 at 13:31
  • May I suggest that you replace, e.g., the comment `"[1, 2]"` with the comment `[1, 2, ...]` or `[1, 2, ???]` or even just `[1, 2, ...`? These are not completed lists and shouldn't be represented so. – dfeuer Jun 25 '14 at 17:23
2

Let's look at this a bit more mathematically. Suppose that

x = [1, 2, 3, 4, ...]

Then

map (+1) x = [2, 3, 4, 5, ...]

so

1 : map (+1) x = 1 : [2, 3, 4, 5, ...] = x

This (turned around) is the equation we started with:

x = 1 : map (+1) x

So what we've shown is that

x = [1, 2, 3, 4, ...]

is a solution of the equation

x = 1 : map (+1) x   -- Eqn 1

The next question, of course, is whether there are any other solutions to Eqn 1. The answer, as it turns out, is no. This is important because Haskell's evaluation model effectively chooses the "least-defined" solution of any such equation. For example, if we instead defined x = 1 : tail x, then any list starting with 1 would be a solution, but we would actually get 1 : _|_, where _|_ represents an error or non-termination. Eqn 1 does not lead to this sort of mess:

Let y be any solution of Eqn 1, so

y = 1 : map (+1) y

Note that we can tell from the definition that

take 1 y = [1] = take 1 x

Now suppose

take n y = take n x

Then

take (n+1) y = take (n+1) (1 : map (+1) y)
             = 1 : take n (map (+1) y)
             = 1 : map (+1) (take n y)
             = 1 : map (+1) (take n x)
             = 1 : take n (map (+1) x)
             = take (n+1) (1 : map (+1) x)
             = take (n+1) x

By induction, we find that take n y = take n x for each n. That is, y = x.
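The induction step above relies twice on the lemma that take commutes with map; that lemma can be spot-checked on prefixes of the question's x (a sketch; `lemmaHolds` is a name introduced here):

```haskell
x :: [Integer]
x = 1 : map (+1) x

-- take n (map f xs) == map f (take n xs), the lemma used in the
-- induction step, checked on finite prefixes of x
lemmaHolds :: Int -> Bool
lemmaHolds n = take n (map (+1) x) == map (+1) (take n x)
```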

dfeuer
0

What you got wrong

In your order of evaluation:

let x = 1:map (+1) x
1:2:map (+1) x
1:2:map (+1) [1, 2] <-- here

you are making a wrong assumption. You are assuming x is [1, 2] because those are the elements you can see at that point. It's not. You have forgotten that the x at the end still needs to be calculated recursively.

The actual flow

The x at the end of the sequence needs to be computed by recursively computing itself. Here's the actual flow:

take 5 $ 1:map (+1) ...
take 5 $ 1:map (+1) (1:map (+1) ...
take 5 $ 1:map (+1) (1:map (+1) (1:map (+1) ...
take 5 $ 1:map (+1) (1:map (+1) (1:map (+1) (1:map (+1) ...
take 5 $ 1:map (+1) (1:map (+1) (1:map (+1) (1:map (+1) (1:map (+1) ...
take 5 $ 1:map (+1) (1:map (+1) (1:map (+1) (1:map (+1) [1 ...
take 5 $ 1:map (+1) (1:map (+1) (1:map (+1) [1, 2 ...
take 5 $ 1:map (+1) (1:map (+1) [1, 2, 3 ...
take 5 $ 1:map (+1) [1, 2, 3, 4 ...
take 5 $ [1, 2, 3, 4, 5 ...
[1, 2, 3, 4, 5]
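One way to see why this flow terminates even though x is infinite: take 5 only ever demands five unrollings of the definition, so the innermost tail is never touched. A sketch of that idea (the name `fiveDeep` and the use of `undefined` for the never-evaluated remainder are introduced here):

```haskell
-- Five unrollings of x = 1 : map (+1) x, with the never-demanded
-- remainder replaced by undefined
fiveDeep :: [Integer]
fiveDeep = 1 : map (+1) (1 : map (+1) (1 : map (+1)
             (1 : map (+1) (1 : map (+1) undefined))))
```

`take 5 fiveDeep` yields `[1,2,3,4,5]` without ever evaluating `undefined`, showing that only a finite part of the expansion is ever forced.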
Shoe
  • Helpful expansion, though it's misleading to call this the "order of evaluation". This doesn't correspond to the order in which things will actually be calculated. – luqui Jun 24 '14 at 01:14
  • you are forcing the whole tail before prepending; this won't work on infinite lists (while in ghci you can see that it does). See my answer. – luqui Jun 24 '14 at 01:25
  • laziness means thunks are memoized; check out my answer, where I show it by naming the interim entities. – Will Ness Jun 24 '14 at 12:24
  • The error is one line *before* what you've marked. `1 : map (+1) x` does *not* expand to `1 : 2 : map (+1) x`. Rather, it expands, essentially, to `1 : map (+1) (1 : tail x)`, which expands to `1 : 2 : map (+1) (tail x)`, which expands to `1 : 2 : map (+1) (2 : tail (tail x))`, etc. – dfeuer Jun 24 '14 at 22:29
  • @WillNess, right. I was only explaining it from a correctness perspective, not an efficiency perspective. – dfeuer Jun 25 '14 at 17:20