Infinite lists that depend on each other in Haskell?

Question

I am working on a programming exercise where the goal is to write a function to get the term at the Nth index of Hofstadter's Figure-Figure sequence.

Rather come up with a basic solution using the formula, I thought it would be an interesting challenge to generate an infinite list to represent the sequence and then index it.

This was my initial approach, however, it hangs when trying to calculate anything past the first two terms.

hof :: Int -> Int
hof = (!!) seqA
  where
    seqA = 1 : zipWith (+) seqA seqB
    seqB = 2 : filter (`notElem` seqA) [3..]

seqA represents the sequence of terms, and seqB is the differences between them.

Though I don't really understand how to use seq, I tried using it to strictly evaluate the terms that come before the desired one, like shown below.

hof :: Int -> Int
hof 0 = 1
hof n = seq (hof $ n - 1) $ seqA !! n
  where
    seqA = 1 : zipWith (+) seqA seqB
    seqB = 2 : filter (`notElem` seqA) [3..]

This also hangs when trying to calculate values past the first index.

After playing around in ghci, I found a way to get this to work in a weird way

ghci> seqB = [2, 4, 5, 6]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> seqB = 2 : filter (`notElem` seqA) [3..]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> hof = (!!) seqA

By giving seqB and initial value and redefining both seqA and seqB afterwards, it seems to function normally. I did notice, however, that the result of passing larger values to hof seems to give different results based on how many terms I initially put in the seqB list. When I redefine the function in ghci, does it still use the older version for functions that call it previous to its redefinition?

I would like to know why this works in ghci and whether it's possible to write a working version of this code using a similar technique. Thanks in advance!

score 1 · Accepted Answer · answered Feb 10 '23 at 03:08

The problem is that seqA is infinite, and so

(`notElem` seqA) x

can never return True. If it sees that x is the first element of seqA, then great: it can return False. But if it doesn't see x, it wants to keep looking: maybe x is the next element! The list never ends, so there's no way it can conclude x is definitely not present.

This is a classic mistake beginners make, trying to filter an infinite list and expecting the list to end at some point. Often, the answer is to use something like

x `notElem` (takeWhile (<= x) infList)

instead. This way, your program gives up on searching for x once it's found a number above x. This only works if your lists are sorted, of course. Your equations look like they probably produce ascending lists, in which case it would work, but I haven't worked through the algebra. If your lists aren't in ascending order, you'll need to design some other stopping condition to avoid the infinite recursion.

@DanielWagner I don't think there's an easy way to do so, since `seqA` increases faster than `seqB` does. I suppose a list of values to check could be held as a state somewhere, but arguably that's even less clean. — spectre256, Feb 10 '23 at 04:43

Daniel Wagner · Answer 2 · 2023-02-10T04:50:20.780

The other answer tells you the problem with your approach, and suggests a great fix. I thought it might be fun to try to work out a slightly more efficient solution, though; it seems a shame to keep checking the beginning of seqA over and over during our membership calls. Here's the idea I had: the point is for seqB to be the complement of seqA, right? Well, what if we just directly define a complement function? Like this:

complement :: Integer -> [Integer] -> [Integer]
complement = go 1 where
    go i xs@(x:xt) = case compare i x of
        LT -> i : go (i+1) xs
        EQ -> i+1 : go (i+2) xt
        GT -> go i xt -- this case should be impossible
    go i [] = [i..] -- this case is irrelevant for our purposes

The EQ case is a bit suspect; it doesn't work for general increasing input sequences. (But see below.) Anyway, with this definition in place, the two sequences can be quite naturally defined:

seqA, seqB :: [Integer]
seqA = 1 : zipWith (+) seqA seqB
seqB = complement seqA

Try it in ghci:

> take 10 seqA
[1,3,7,12,18,26,35,45,56,69]

Nice. Now, if we fix up the EQ case to work properly for all (increasing) input sequences, it would have to look like this:

complement :: Integer -> [Integer] -> [Integer]
complement = go i where
    go i xs@(x:xt) = case compare i x of
        LT -> i : go (i+1) xs
        EQ -> go (i+1) xt
        GT -> go i xt -- still impossible
    go i [] = [i..] -- still irrelevant

Unfortunately, our definitions of seqA and seqB above don't quite work any more. The right first value for seqB depends on whether 2 is in seqA, but whether 2 is in seqA depends on whether the first value of seqB is 1 or not... Luckily, because seqA grows much faster than seqB, we only have to prime the pump a little.

seqA, seqB :: [Integer]
seqA = 1 : 3 : 7 : zipWith (+) (drop 2 seqA) (drop 2 seqB)
seqB = complement seqA
-- OR
seqA = 1 : zipWith (+) seqA seqB
seqB = 2 : 4 : drop 2 (complement seqA)

Try it in ghci:

> take 10 seqA
[1,3,7,12,18,26,35,45,56,69]

The definition of seqX is a bit less natural, but the definition of complement is a bit more natural, so there seems to be something of a tradeoff there.

The "impossible" case is quite possible if you can think of an `Integer` that's less than 1. Not relevant here, of course, but it seems weird to call it out as impossible when the empty-list case is only called out as irrelevant. — amalloy, Feb 10 '23 at 07:27

score 0 · Answer 3 · answered Feb 10 '23 at 05:02

As an answer to this part:

When I redefine the function in ghci, does it still use the older version for functions that call it previous to its redefinition?

Yes, that's the way it has to work. Bindings at the ghci prompt are not mutable variables as you would have in an imperative language, they're supposed to work the same way as variables do in every other part of Haskell.

So when you have this:

ghci> a = 1
ghci> b = [a]
ghci> b
[1]

a is just a name for 1, and b is just a name for [1]. The latter was calculated by from the expression [a] by seeing what value a was a name for, but it is absolutely the value [1] and not the expression [a] that b refers to.

ghci> a = 2
ghci> b
[1]

Executing a = 2 doesn't change the value referred to by a, it just changes the state of the environment available at the ghci prompt. This cannot affect any values that were calculated when a was a name for 1; they were and remain pure values.

An easy way to think about it is that a = 2 is not "changing a", it's just introducing a new and separate binding. Because it happens to have the same name as an existing one the new one shadows the old one, making the old one impossible for you to refer to in any future expressions. But nothing about the old one has been changed.

And you will in fact see exactly the same behaviour in a compiled module in contexts where you can have multiple bindings for one name (if you shadow a function argument with a let, or nest lets, etc). All but one of them will be inaccessible, but things that were defined in terms of the shadowed binding remain exactly the same; they aren't re-evaluated as if they were defined in terms of the new binding.

So with that in mind, it becomes easy to explain why this works:

ghci> seqB = [2, 4, 5, 6]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> seqB = 2 : filter (`notElem` seqA) [3..]
ghci> seqA = 1 : zipWith (+) seqA seqB
ghci> hof = (!!) seqA

It's much the same as if you had defined it this way:

ghci> seqB_old = [2, 4, 5, 6]
ghci> seqA_old = 1 : zipWith (+) seqA_old seqB_old
ghci> seqB_new = 2 : filter (`notElem` seqA_old) [3..]
ghci> seqA_new = 1 : zipWith (+) seqA_new seqB_new
ghci> hof = (!!) seqA_new

seqB_old is just a finte list
Because zipWith stops at the length of the shortest list, seqA_old is also just a finite list, even though it's defined in terms of itself.
seqB_new is an infinite list that just has to filter each element against any of the elements of the finite list seqA_old; this doesn't get caught up in the problem amalloy points out, but it isn't actually the correct list you were trying to define
seqA_new is defined in terms of itself, but seqB_new was defined in terms of seqA_old, not this new version. There is simply no mutual recursion happening.

score 0 · Answer 4 · answered Feb 10 '23 at 06:54

This problem doesn’t really lend itself to a mutually recursive solution. filter + notElem will continue searching beyond where they could ever return a result, because they can’t make any use of the fact that the sequence is strictly ascending.

Rather than searching for the next element that we haven’t seen, we can turn the problem around: start by assuming we will see every number, and use delete to prune out those numbers that we know we will want to exclude.

hof :: Int -> Int
hof = (!!) seqA
  where

    -- By definition, one is the cumulative sum of the other.
    seqA = scanl' (+) 1 seqB

    -- Iteratively build the sequence.
    seqB = unfoldr (infinitely step) (1, [2 ..])
    step c (d, xs) = (c, (c + d, delete (c + d) xs))

    -- Helper for when ‘unfoldr’ is known to have
    -- unbounded input (‘x : xs’ always matches)
    -- and unbounded output (we always return ‘Just’).
    infinitely f (d, x : xs) = Just (f x (d, xs))

Infinite lists that depend on each other in Haskell?

4 Answers4