7

I'm learning Haskell coming from an imperative (mostly C) programming pov, with some basic understanding of Prolog. While trying to solve some exercises I've come across one that has bugged me for some time. After finding the answer provided, more doubts arose. I have to generate an infinite list of lists starting from an alphabet given as input (also known as kleene closure). The function is defined as follows:

kleene :: [a] -> [[a]]
kleene xs = []:[kle++[x] | kle <- kleene xs, x <- xs]

It works, but I do not seem to grasp how it works (which is the main point of the exercise I think). From what I understand from list comprehensions, I take the first element of the first list (here kleene xs) and then go through the elements of the second list (xs). This example shows the first 20 elements of the kleene closure.

> take 20 $ kleene [1,2,3]
> [[],[1],[2],[3],[1,1],[1,2],[1,3],[2,1],[2,2],[2,3],[3,1],[3,2],[3,3],[1,1,1],[1,1,2],[1,1,3],[1,2,1],[1,2,2],[1,2,3],[1,3,1]]

Now, what's going on here? Why the evaluation of this does not spiral out of control? Is it because of lazy evaluation, that allows to go on without really computing if not necessary? If I try to substitute and evaluate the expression, I get stuck in a loop:

[[],[kleene [1,2,3],1],[kleene [1,2,3],2],[kleene [1,2,3],3],[kleene [1,2,3],...]

To compute kleene I have to infinitely compute it. In the first row, once I resolve kleene I get

[[],[[],kleene [1,2,3],1,1],[[], kleene [1,2,3],1,2],[[], kleene [1,2,3],1,3],...]

But then it should go on for ever. Why do I get an output that is progressively bigger instead of infinitely big elements? Is it because of Lazy Evaluation, that each recursive call counts as an element of the list, so that the next element is of kle is only the subsequent recursive call, while carrying the elements of xs every time? Can someone help me out understanding this?

Micha Wiedenmann
  • 19,979
  • 21
  • 92
  • 137
Vrantamar
  • 83
  • 1
  • 7
  • 1
    Yes it is because of lazy evaluation. The first item is already accessible (the `[]` in `[] : [...]`), so the list comprehension can produce its first element, which is the second of `kleene xs`, and thus allows to further navigage through the list. As long as you don't call `kleene []`, it will each time already product an item *before* it is consumed by the recursive call. – Willem Van Onsem May 09 '23 at 11:04
  • 4
    `[[],[kleene [1,2,3],1]...` is wrong: it should be `[[],kle0++[1]...` where `kle0` is the first element of `kleene [1,2,3]`. So, `kle0=[]` and we get `[[],[1]...`. Here we discovered that the second element `kle1` is `[1]`. You can proceed with this and note that each element in the output only depends on the previous ones, so laziness works as intended and the recursion produces the whole infinite list. – chi May 09 '23 at 11:10
  • 3
    Simplifying a bit, the whole `kleene [1,2,3]` recursion works as the pseudocode `[kle0,kle1,kle2,....] = [[], kle0++[1], kle0++[2], kle0++[3], kle1++[1], kle1++[2], kle1++[3], kle2++[1], ...]`. You should be able to compute from this all the values `kle0,kle1,...`, and so the whole output. – chi May 09 '23 at 11:13
  • To be honest, given what the doubt is (i.e. all about Haskell, infinite lists, memoization, and nothing about the specific alphabet example), I think a visually simpler example would help more, such as [the Fibonacci numbers](https://stackoverflow.com/a/1105840/5825294). – Enlico May 10 '23 at 07:14
  • Thank you chi and Willem for commenting and shedding light on this. – Vrantamar May 10 '23 at 07:28

1 Answers1

4

Is it because of lazy evaluation, that allows to go on without really computing if not necessary?

More or less, yes.

If you call kleene xs it returns [] : […], so it is a "cons" (a list data object) where the head is known, an empty list, and something that still needs to be evaluated).

This thus means that it can produce the first item, without any problem. Which thus can be used in the list comprehension.

The first item for kleene xs is thus [], for any value of xs, so in the list comprehension []:[kle++[x] | kle <- kleene xs, x <- xs] it thus takes kle = [] and then starts enumerating over the lists, so if xs = [1,2], then it will produce kle ++ [1] and kle ++ [2], so [1] and [2]. Then it moves to the next item, this will force producing the next item in the recursive call, but we can, these are [1] and [2] and these can be produced without any recursive call to kleene xs. After that it thus will make an extra recursive call.

This makes it not very efficient. It will grow with (log2n) levels of recursive calls for a list with length n. We can optimize this by simply referring to the same list we are constructing. Indeed:

kleene :: [a] -> [[a]]
kleene xs = go
  where go = [] : [kle++[x] | kle <- go, x <- xs]

Now we thus only make one call and the producer of the next items of the list, will also read items from the same list. The downside of this is that it will require memory. Indeed, it will hold all the items between the current position we need for kle <- go, and the one we are producing.

Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
  • Thank you for taking the time to transform the previous comment into a detailed answer, this really helped me understand better. – Vrantamar May 10 '23 at 07:26