18

I am trying to understand Haskell monads, reading "Monads for the Curious Programmer". I've run into the example of List Monad:

tossDie=[1,2,3,4,5,6]

toss2Dice = do
    n <- tossDie
    m <- tossDie 
    return (n+m)

main = print toss2Dice

The way do block produces m as 36-element list I sort of understand - it maps every element of n as a list of 6 elements and then concatenates those lists. What I do not understand is how n is changed by the presence of m <- tossDie, from 6 elements list to 36 elements. Obviously "we first bind n and then bind m" is wrong understanding here, but what is right?

I am also not fully clear on how two-argument function is applied in do block. I suspect it's a case of curying, but I am slightly hazy as to how exactly it works.

Can someone please explain the above two mysteries?

JB.
  • 40,344
  • 12
  • 79
  • 106
  • 1
    Do you already know what `do` notation desugars to? – leftaroundabout Nov 13 '13 at 18:52
  • 1
    Yes, I think I understand that part - that's how I figure out where `m` comes from. But obviously I do not understand enough :) –  Nov 13 '13 at 18:52
  • 3
    it's just `tossDie >>= (\n -> tossDie >>= (\m -> return (n+m)))` – Kristopher Micinski Nov 13 '13 at 18:55
  • 1
    possible duplicate of [Haskell do notation to bind](http://stackoverflow.com/questions/16964732/haskell-do-notation-to-bind) – Kristopher Micinski Nov 13 '13 at 18:58
  • Here is my understanding: as we invoke the entire thing, `n` is bound to 1,2,3,4,5,6 sequentially as map iterates over the result of tossDie. How do we arrive at the value of `n` that I see if I replace `return (n+m)` with `return n` ? –  Nov 13 '13 at 19:00
  • I now try to replace the >>= operator with the "real thing" (I believe that for list monads that's `concat` and `map`). I can get as far as `toss2Dice = (concat (map (\m->return m) (concat (map (\n -> tossDie) (tossDie)))))` . However, In this code I can't access `n`. What am I missing? –  Nov 13 '13 at 19:36
  • Looks like the key is the replacement of `do` with `>>=` one `<-` at a time! Thanks, @comingstorm –  Nov 13 '13 at 20:25
  • see if [this](http://stackoverflow.com/a/16805262/849891) also helps. Also, [this](http://www.haskell.org/haskellwiki/Keywords#.3C-). – Will Ness Nov 13 '13 at 20:59
  • Would you mind putting the output it produces into the question? – WW. Nov 14 '13 at 02:21
  • You can run it online at the tutorial pages (https://www.fpcomplete.com/school/starting-with-haskell/basics-of-haskell/function-application). The author of the tutorial provides huge number of live examples. Just replace the example's text with the code you want to run. –  Nov 14 '13 at 14:26

4 Answers4

14

For lists (such as tossDie), the do notation acts like a list comprehension -- that is to say, as if each variable binding were a nested foreach loop.

The do-block expression:

toss2Dice = do { n <- tossDie; m <- tossDie; return (n+m) }

does the same thing as this list comprehension:

toss2Dice = [ n+m | n <- tossDie, m <- tossDie ]

The result is comparable to the following imperative pseudocode:

toss2Dice = []
foreach n in tossDie:
    foreach m in tossDie:
        toss2Dice.push_back(n+m)

except that the Haskell examples generate their results lazily, on demand, instead of eagerly and all at once.


If you look at the monad instance for lists, you can see how this works:

instance Monad [] where
  xs >>= f = concat (map f xs)
  return x = [x]

Starting at the beginning of the do block, each variable binding creates a loop over the remainder of the block:

do { n <- tossDie; m <- tossDie; return (n+m) }
===>   tossDie >>= \n -> do { m <- tossDie; return (n+m) }
===>   concat ( map (\n -> do { m <- tossDie; return (n+m) }) tossDie )

Note that map function iterates over the items in the list tossDie, and the results are concatenated. The mapping function is the remainder of the do block, so the first binding effectively creates an outer loop around it.

Additional bindings create successively nested loops; and finally, the return function creates a singleton list out of each computed value (n+m) so that the "bind" function >>= (which expects lists) can concatenate them properly.

comingstorm
  • 25,557
  • 3
  • 43
  • 67
  • Could you comment some more on the actual substitutions that take place in Haskell? In particular, the substitution that leads from >>= to the actual implementation of >>= for List Monad? –  Nov 13 '13 at 19:39
  • And continuing your method, replacing the inner `do`, I get what I wanted: `toss2Dice = concat ( map (\n -> concat (map (\m->return (m+n)) tossDie)) tossDie )`. Thank you! –  Nov 13 '13 at 20:29
13

The interesting bit I assume is this:

toss2Dice = do
  n <- tossDie
  m <- tossDie 
  return (n+m)

This is somewhat equivalent to the following Python:

def toss2dice():
    for n in tossDie:
        for m in tossDie:
            yield (n+m)

When it comes to the list monad, you can view the binding arrows (<-) in do notation as traditional imperative "foreach" loops. Everything after

n <- tossDie

belongs to the "loop body" of that foreach loop, and so will be evaluated once for every value in tossDie assigned to n.

If you want the desugaring from do notation to actual bind operators >>=, it looks like this:

toss2Dice =
  tossDie >>= (\n ->
    tossDie >>= (\m ->
      return (n+m)
    )
  )

Notice how the "inner loop body"

(\n ->
  tossDie >>= (\m ->
    return (n+m)
  )
)

Will get executed once for every value in tossDie. This is pretty much the equivalent to the nested Python loops.


Technical mumbo-jumbo: The reason you get "foreach" loops from the binding arrows has to do with the particular monad you are working with. The arrows mean different things for different monads, and to know what they mean for a particular monad you have to do some sleuthing and figure out how that monad works in general.

The arrows get desugared into calls to the bind operator, >>=, which works differently for different monads as well – this is the reason the bind arrows <- also work differently for different monads!

In the case of the list monad, the bind operator >>= takes a list to the left and a function returning a list to the right, and sort of applies that function to every element of the list. If we want to double every element in a list in a cumbersome way, we could imagine doing it as

λ> [1, 2, 3, 4] >>= \n -> return (n*2)
[2,4,6,8]

(return is required to make the types work out. >>= expects a function that returns a list, and return will, for the list monad, wrap a value in a list.) To illustrate a perhaps more powerful example, we can start by imagining the function

λ> let posneg n = [n, -n]
λ> posneg 5
[5,-5]

Then we can write

λ> [1, 2, 3, 4] >>= posneg
[1,-1,2,-2,3,-3,4,-4]

to count the natural numbers between -4 and 4.

The reason the list monad works this way is that this particular behaviour of the bind operator >>= and return makes the monad laws hold. The monad laws are important for us (and perhaps the adventurous compiler) because they let us change code in ways that we know will not break anything.

A very cute side effect of this is that it makes lists very handy to represent uncertainty in values: Say you are building an OCR thingey that's supposed to look at an image and turn it into text. You might encounter a character that could be either 4 or A or H, but you are not sure. By letting the OCR thingey work in the list monad and return the list ['A', '4', 'H'] you have covered your bases. Actually working with the scanned text then become very easy and readable with do notation for the list monad. (It sorta looks like you're working with single values, when in fact you are just generating all possible combinations!)

kqr
  • 14,791
  • 3
  • 41
  • 72
  • I think I am getting closer to understanding here. I need to ponder on the meaning of `yield` in your Python. –  Nov 13 '13 at 19:04
  • @Arkadiy `yield` is like `return` in Python, but it returns one element of a list at a time. Instead of saying `return [7, 2, 3]` you can say `yield 7; yield 2; yield 3`. Similarly, instead of `res = []; for n in [1, 2, 3]: res.append(n); return res` you can do `for n in [1, 2, 3]: yield n`. – kqr Nov 13 '13 at 19:41
  • Yes, I know how yield works in Python. I am trying to figure out how yield maps to haskell. Somehow, return(n+m) is the innermost function that has access to both n and m as they are bound by two maps. I can't quite get there from `do` block. The first step is easy, provided in the comment below. The second step involves unwinding `bind` for list monad, and that's where I have difficulties. –  Nov 13 '13 at 19:49
  • kqr, could you correct my interpretation of bind (it's in the comments on the question, but I am pasting it here also): `toss2Dice = (concat (map (\m->return m) (concat (map (\n -> tossDie) (tossDie)))))` –  Nov 13 '13 at 19:51
  • @Arkadiy I've edited my answer to provide a translation into bind operators `>>=`, if that is the part you are getting stuck on. – kqr Nov 13 '13 at 19:55
4

Adding to @kqr answer:

>>= for list monad is actually concatMap, a function that maps elements to lists of elements and concatenates the lists, but with arguments flipped:

concatMap' x f = concat (map f x)

or alternatively

concatMap' = flip concatMap

return is just

singleElementList x = [x]

Now we can replace >>= with concatMap' and singleElementList:

toss2Dice =
  concatMap' tossDie (\n ->
    concatMap' tossDie (\m ->
      singleElementList (n+m)
    )
  )

Now we can replace the 2 functions with their bodies:

toss2Dice =
  concat (map (\n ->
    concat (map (\m ->
      [n+m]
    ) tossDice)
  ) tossDice)

Remove extra line breaks:

toss2Dice = concat (map (\n -> concat (map (\m -> [n+m]) tossDice)) tossDice)

Or shorter with concatMap:

toss2Dice = concatMap (\n -> concatMap (\m -> [n+m]) tossDice) tossDice
nponeccop
  • 13,527
  • 1
  • 44
  • 106
3

following nponeccop's advice, with

for = flip concatMap

your code becomes

toss2Dice = 
    for {- n in -} tossDie {- call -}
        (\n-> for {- m in -} tossDie {- call -}
                  (\m-> [n+m]))

where it is plainly visible that we have nested functions, one inside another; so the inner function (\m-> [n+m]), being in the scope of outer function's argument n, has access to it (to the argument n that is). So it uses the value of the argument to the outer function, which is the same on each invocation of the inner function, however many times it is invoked during the same invocation of the outer function.

This can be re-written with named functions,

toss2Dice = 
    for {- each elem in -} tossDie {- call -} g
    where g n = for {- each elem in -} tossDie {- call -} h
                where h m = [n+m] 

The function h is defined inside g, i.e. in g's argument's scope. And that's how h gets to use both m and n even though only m is its argument.

So in fact we do indeed "first bind n and then bind m" here. In nested fashion, that is.

Community
  • 1
  • 1
Will Ness
  • 70,110
  • 9
  • 98
  • 181