3

In the same scenario as my previous question, I was trying to implement the cycle function only¹ using a fold, when I came up with the following wrong function, which tries to concatenate the accumulator with itself, building the infinite list exponentially (yes, I know this would mean that it generates 2048 copies if one wants to take 1025 of them)

myCycle :: [a] -> [a]
myCycle s = foldr (\_ a -> a ++ a) s [1..]

However, using it throws *** Exception: heap overflow.

This version, instead, works like a charm

myCycle :: [a] -> [a]
myCycle s = foldr (\_ a -> s ++ a) s [1..]

My question is, why does the former version overflow, as compared to the latter? I feel the reason is dumber than me...

[1] I mean, implementing cycle as a fold, having only the step function and the seed as degrees of freedom.

Enlico
  • 23,259
  • 6
  • 48
  • 102
  • Because it is infinite recursion, so Haskell never reaches a point where it can "emit" the first element. – Willem Van Onsem Apr 29 '20 at 18:24
  • Does it really say "heap overflow"? I expected a stack overflow, and that's what I got when I tried it in GHCI. – amalloy Apr 29 '20 at 18:25
  • 2
    @WillemVanOnsem As you know, lots of perfectly well-behaved Haskell functions use infinite recursion. An answer that justified why in this case it doesn't work would be much more helpful. – amalloy Apr 29 '20 at 18:27
  • @amalloy, yes, it does, and _why the heck do I get heap overflows instead of stack overflows_ is on my to-ask list. – Enlico Apr 29 '20 at 18:27
  • 2
    @EnricoMariaDeAngelis: infinite recursion is not the problem itself, as long as the evaluation reaches a cons `(:) h t` where `t` might result in more recursion, etc. Since then when obtaining the first (and other) elements, it will stop at that point. If however you write `a -> a ++ a`, it will never get to such point, since `a` will each time be replaced with a the concatenation of two lists. – Willem Van Onsem Apr 29 '20 at 18:37
  • Well, for my part, I get `Non type-variable argument in the constraint: Num [a]`, and not any sort of exception at all. – A. R. Apr 29 '20 at 18:38
  • @WillemVanOnsem, I think the concept is sitting midway between my skull and my brain. I hope you or someonelse can help me push it in with a thorough answer. – Enlico Apr 29 '20 at 18:39
  • 3
    The first one is similar to `f s = f s ++ f s`, which will never produce the first element, even if you keep on unrolling the recursive definition (`s` is passed around but is never fed to `++` or used in any way). The second one is more like `f s = s ++ f s` which will immediately produce the first element (unless `s==[]`). – chi Apr 29 '20 at 19:15
  • BTW, exponential cycling via higher-order functions is e.g. `expoCycle = concat . iterate (\a -> a ++ a)`. – Will Ness Apr 30 '20 at 10:33

1 Answers1

6

foldr c n takes a list and replaces every (:) with c and the final [] with n. But [1..] has no final [], so foldr (\_ a -> a ++ a) s has no place to put the s. Therefore no information ever "flows" from s to the result of myCycle s, which means it has no choice but to be bottom (or rather: it has too much choice—it's underspecified—so it gives up and falls back to bottom). The second version actually does use s, because it appears in the folding function, which is used when foldr acts on an infinite list.

In fact, we have the identity

foldr (\_ -> f) x xs = fix f = let x = f x in x

when xs is infinite. That is, the second argument of foldr is completely ignored when the list is infinite. Further, if that folding function doesn't actually look at the elements of the list, then all that's really happening is you're infinitely nesting f within itself: fix f = f (f (f (f ...))). fix is fundamental in the sense that every kind of recursion can be written in terms of it (certain more exotic kinds of recursion require adding some language extensions, but the definition fix f = let x = f x in x itself doesn't change). This makes writing any recursive function in terms of foldr and an infinite list trivial.

Here's my take on an exponential cycle. It produces 1 copy of the input, concatenated onto 2 copies, concatenated onto 4, etc.

myCycle xs = xs ++ myCycle (xs ++ xs)

You translate an explicitly recursive definition to fix by abstracting the recursive call as a parameter and passing that to fix:

myCycle = fix \rec xs -> xs ++ rec (xs ++ xs)

And then you use the foldr identity and introduce a bogus [] case

myCycle = foldr (\_ rec xs -> xs ++ rec (xs ++ xs)) (error "impossible") [1..]
HTNW
  • 27,182
  • 1
  • 32
  • 60
  • Is the ninth letter in your answer a typo? – Enlico Apr 29 '20 at 19:30
  • 1
    Whoops. The mnemonics are supposed to mean `c`ons and `n`il, but I used `s` later in the sentence. The first sentence is a general statement about `foldr`, so I didn't want to use `s` since that's just this instance. – HTNW Apr 29 '20 at 19:46
  • +1. I need some time to digest the second part of your answer. Concerning the first part, when you write _it has too much choice—it's underspecified_, that makes me think of the linear algebra-like concept of indetermination when you have more degrees of freedom (e.g. variables) than constraints (e.g. equations). How would you cast that sentence/explanation of your in these terms? What are the to few constraints and what the degrees of freedom, in your first paragraph, if it makes any sense? I'm just trying to understand every single bit of your answer; a bit at a time. – Enlico Apr 29 '20 at 21:08
  • Your first `myCycle` is basically `myCycle _ = let a = a ++ a in a`. There are *many* valid solutions to the equation for `a`: any infinite list is a solution, plus `[]`. Importantly, `undefined` is also a solution. The language defines an order on values by "definedness" or "information content". For lists, `undefined < []` and `undefined < undefined : undefined`. (NB: This `<` isn't itself a part of Haskell.) When an equation has multiple solutions, the "least" or "most undefined" one is chosen. "Safe" recursion boils down to making equations strong enough that `undefined` isn't a solution. – HTNW Apr 29 '20 at 22:22
  • @EnricoMariaDeAngelis *`myCycle _s = foldr (const f) undefined [1..] = f (foldr (const f) undefined [1..])`* = (turning equality into definition) = *`let { a = f a } in a = let { a = (\a -> a ++ a) a } in a = let { a = a ++ a } in a`*. also, by definition of `fix`, *`let { a = f a } in a = fix f = fix (\a -> a ++ a)`*. by definition of `++`, *`take 1 a = take 1 (a ++ a) = case a of { [] -> take 1 a ; (x:_) -> [x] } = case (a ++ a) of { [] -> take 1 a ; (x:_) -> [x] }`* and so accessing `a ++ a`, even 1 element of it, means accessing `a ++ a`, a *vicious circle*. (also, "left recursion"). – Will Ness Apr 30 '20 at 10:27
  • @WillNess, this would probably puzzle me written in an answer, but in a comment it gives me a headache, ahahaha. – Enlico Apr 30 '20 at 11:50
  • 1
    @EnricoMariaDeAngelis and I was trying to clarify things for you... fail on my part! :) but do try writing them line under the previous line to see what changes on each step, easier. – Will Ness Apr 30 '20 at 12:37
  • No worries, @WillNess, it will certainly help me, and I appreciate it. I was just joking on the formatting issue ;) – Enlico Apr 30 '20 at 12:39