Haskell: iterate in State, how to force the behaviour I want?

Question

This is my first posting on SO, and I'm relatively new to Haskell, so please excuse any missteps or if my code is not idiomatic!

Consider the following two intuitive descriptions of: a, f(a), f(f(a))...

A. a list containing: a, the application of f to a, the application of f to that, the application of f to that...

B. a list containing, in the ith position, i nested applications of f to a.

My issue is that I got burnt trying to use the iterate function in Haskell to do A. My real application is a simulation, but the following contrived example highlights the problem.

import Control.Monad.State

example :: State Int [[String]]

step :: [String] -> State Int [String]
step l = do
         currentState <- get
         let result = if (currentState == 1)
                          then "foo":l
                          else "bar":l
         put (currentState + 1)
         return result

example = do
          sequence $ take 3 . iterate (>>= step) $ return []

With these definitions,

evalState example 1

results in:

[[],["foo"],["bar","bar"]]

Clearly, iterate does B, not A! Because the step function only ever adds something to the input list, step ["foo"] could not possibly result in ["bar", "bar"], no matter what the state!

Let me say that I do understand what is going on here, and also that - formally - the result is exactly "as it should be": step is a stateful function, so when f(a) comes up for evaluation as part of f(f(a)), it will be recomputed rather than taken from the second list item because the state has changed. I also realize I could avoid this in my real-life application by putting my accumulating list inside the state.

Nevertheless, there are two reasons for posting this.

First, the point is that iterate is frequently explained in a way that may potentially mislead a beginner to think it does A, when it actually does B. This includes Learn You A Haskell (which I otherwise found incredibly useful), but also post on SO (here and here, for example). In fact, the verbal explanation of iterate in LYAHFGG is almost exactly definition A above. So it might be useful to have a post on this that as a resource for other Haskell newbies who get a bug because of this and are searching for an explanation (so by all means do post more accurate, technical, better phrased, clarifications on the difference between A and B below).

Second, I would still be interested whether there is a function that will, in fact, do A! In other words, how can I, in the above stateful example, produce the list (with slight abuse of notation): [a, b = f(a), f(b), ...]? In other words, given

example2 = do
           firstResult <- step []
           secondResult <- step firstResult
           return $ [[], firstResult, secondResult]

for which

evalState example2 1

yields the desired result

[[],["foo"],["bar","foo"]]

How can I rewrite example2 using iterate?

On the beginners Haskell list, a related question regarding a memoizing version of iterate was posted. However, that query does not appear to have received an answer.

I'm not entirely sure laziness is really the problem in my application. Would a strict version of iterate do what I want? My own, naive, 'strict iterate' as below does not appear to make any difference.

iterate' f x = x : rest
               where previous = f x
                     rest = previous `seq` iterate f previous

Any insights on all of this would be much appreciated!

Daniel Fischer · Accepted Answer · 2011-10-28T15:11:13.637

17

There is no difference between A. and B., they are the same thing by referential transparency.
The core of the problem seems to be that you're interpreting them in the context of execution of stateful computations. In that context, the analogue of A that you're expecting is
A': Produce a result list by 1. putting the result of the initial computation into the list, 2. determine the next computation from the previous result, execute it and append its result to the list, 3. goto 2.
The analogue of B is
B': Produce a list of computations (by iterating (>>= step)) and from that a result list by executing the computations one after the other.
For stateless computations, or when you pass the same initial state to all computations produced in B', the only difference is in efficiency, but if you're using sequence, each computation starts with a different state, so you get different results from A'. Breaking down your example, we have

actionList = take 3 $ iterate (>>= step) (return [])
           = [return [], return [] >>= step, return [] >>= step >>= step]

a list of actions (or monadic values) in State Int [String]. Now, when you apply sequence to that,

example = sequence actionList

you get an action that when executed, runs the first of these actions with the initial state, the second with the state as updated by the first, and the third with the state as updated by the second. To get the behaviour you expect, they all would have to be run with the same initial state.

Basically, a value of type State s v is a function of type s -> (v, s). iterate creates a list of functions, and sequence applys these functions, supplying different s arguments to them (each gets the s produced by the previous).

To get the desired behaviour, we must introduce a new combinator. Very nice, but usable in only very few Monads is

iterateM :: Monad m => (a -> m a) -> m a -> m [a]
iterateM step start = do
    first <- start
    rest <- iterateM step (step first)
    return (first:rest)

Which produces

Prelude Control.Monad Control.Monad.State> evalState (fmap (take 4) (iterateM step (return []))) 1
[[],["foo"],["bar","foo"],["bar","bar","foo"]]

But it works only in monads with sufficiently lazy (>>=), Control.Monad.State.Lazy is one of the few, Control.Monad.State.Strict is not. And even with C.M.S.Lazy, You cannot use the state after an iterateM, you have to put a new state before you can continue the computation. To get something usable with other monads, we could add a count parameter,

iterCountM :: (Monad m) => Int -> (a -> m a) -> m a -> m [a]
iterCountM 0 _ _ = return []
iterCountM k step start = do
    first <- start
    rest <- iterCountM (k-1) step (step fisrt)
    return (first:rest)

so we lose flexibility, but gain usability in more monads.

edited Oct 28 '11 at 15:11

answered Oct 28 '11 at 11:04

Daniel Fischer

181,706
17
308
431

Many thanks for the response! Everything you say after "Breaking down your example..." makes perfect sense, and is exactly what I meant by "Let me say that I do understand what is going on here..." So thanks for confirming that. But I'm still not convinced by the interpretation. I intend for *A* to mean that the second element is the application of step to the fully evaluated result of the first element, `step ["foo"]` to equal `["bar", "bar"]`, which can never be the case, *regardless of state*. With that interpretation in mind, the example implements *B* but not *A*, so they cannot be equal. – Bilal Barakat Oct 28 '11 at 12:07
Running all the sequenced actions with the same initial state makes sense here. In my real application, however, the state also includes a random generator, and several stochastic events occur in each `step`. So even if I ensure the identical random generator is fed into the "first" step in each position in the result list, they might result in different outcomes, right? But then the second entry could no longer be interpreted as an evolution of the first... – Bilal Barakat Oct 28 '11 at 12:20
Ah, but disregarding the newtype wrappers, iterate gives you a list of functions, (modulo newtype and currying, `(>>= step) :: (([String],Int) -> ([String],Int)) -> (([String],Int) -> ([String],Int))`). Each function being the result of the application of this transformation to the previous function (fully evaluated or not doesn't matter due to referential transparency). You want to include the result of an application of the previous function in the determination of the next, that's a different pattern, not iterate. – Daniel Fischer Oct 28 '11 at 12:34
Yes, that's exactly the pattern I have in mind. So, that means with reference to `example2` (c.f. edit to the original question), the answer would be: "you don't rewrite using `iterate`!" ? What would be an equally compact/elegant/idiomatic way to do it, then? Like I said, I'd be interested in ways of doing this without accumulating the result inside the state. – Bilal Barakat Oct 28 '11 at 12:43
Daniel, if you'll put your second comment (with context) in a separate answer, I can vote that one up... (I think your first answer is certainly a very clear explanation of why `example` does what it does, it just happens that I was instead looking for guidance on how to make it do something else!) – Bilal Barakat Oct 28 '11 at 12:51
@Bilal, I added guidance to my original answer, hope it helps. – Daniel Fischer Oct 28 '11 at 13:20
that's incredibly helpful, thank you very much for your generous help! – Bilal Barakat Oct 28 '11 at 13:31
My only quibble would be that it's potentially confusing that the answer starts off with the claim "A = B", when it then goes on to show that `iterate` (which arguably performs *B*), and `iterateM` (which arguably performs *A*) really are different... If there's a way of phrasing A and B that would allow us to agree they are different (just like the two iterators are different), we could edit both question and answer accordingly. I think that would result in a more useful reference resource for someone in my position. I'm guessing it's not possible to edit the answer once I've accepted it? – Bilal Barakat Oct 28 '11 at 13:39
I think it's possible to edit even accepted answers, but let me think about clarifying it. The point is that at iterate's type, A and B are the same thing, but you want A' at a different type. – Daniel Fischer Oct 28 '11 at 13:47
@BilalBarakat A and B are the same thing. You're making the assumption that the initial state when you first call step is always 1. Due to your use of sequence, that's not true. In the context of State s, sequence means "perform the first action with the current state, pass the resulting state to the next action, and perform it, etc, update the current state to be the result of the last action". You seem to be assuming it's "perform each action with the current state". – Carl Oct 28 '11 at 15:57
@Carl Thanks for joining in, but Daniel has already cleared it all up for me. The piece of the puzzle I was missing is not at all the one you think, but never mind. Sorry for needing to ask, but as an SO newbie: am I supposed to edit the original question to avoid baiting further comments along those lines? – Bilal Barakat Oct 28 '11 at 18:05

score 3 · Answer 2 · answered Oct 29 '11 at 06:03

This may not answer the question you posed, but what you are doing sounds an awful lot like unfoldr.

step (seed, l) = Just (l', (seed', l'))
  where seed' = succ seed
        l' = (if seed == 1 then "foo" else "bar"):l

ghci> take 3 $ unfoldr step (1, [])
[["foo"], ["bar", "foo"], ["bar", "bar", "foo"]]

No monads needed. I'm sort of stabbing in the dark as you didn't specify what you are actually trying to do, but whether or not I got step right, the takeaway message is that unfoldr can be used for simple state threading, too.

unfoldr :: (seed -> Maybe (val, seed)) -> seed -> [val]

Thanks @Dan, nice tipp, and indeed I wasn't aware of unfoldr (although I had looked into using mapAccumL). I'll definitely add that to my toolkit, but for my present purpose, each "step" is pretty involved and might be expanded still, so the generality of the monad is probably useful to have. — Bilal Barakat, Oct 30 '11 at 16:07

Haskell: iterate in State, how to force the behaviour I want?

2 Answers2