Need explanation for basic do block syntax

Question

In ghci, I wrote:

 let x = do
    i <- [1..5]
    j <- [2..4]
    return i

Expected result:

[1,2,3,4,5]

Actual result:

[1,1,1,2,2,2,3,3,3,4,4,4,5,5,5]

I don't understand the logic behind that output. I think the reason might be something about monad, but I am very new to functional programming, I wish someone could explain it a little bit.

I've also tried the equavalent form in List-comprehension and the result is the same, which means there is something basic I misunderstood here.

If you're familiar with programming in an imperative language, this is comparable to a nested for loop. Otherwise you can look up how to desugar do syntax, work through the defns etc. — moonGoose, Aug 19 '20 at 11:54
try `[ (i,0) | i <- [1..5], j <- [2..4]]`, `[ (i,j) | i <- [1..5], j <- [2..4]]`. — Will Ness, Aug 19 '20 at 15:05
The syntax is easy. Your problem is that you don't understand how the list monad works. — chepner, Aug 19 '20 at 15:47
on that note, [Haskell Monad - How does Monad on list work?](https://stackoverflow.com/questions/51172904/haskell-monad-how-does-monad-on-list-work). — Will Ness, Aug 19 '20 at 16:29
related: [All combinations of elements of two lists in Haskell](https://stackoverflow.com/questions/32093912/all-combinations-of-elements-of-two-lists-in-haskell). — Will Ness, Aug 19 '20 at 20:35

jpmarinier · Answer 1 · 2020-08-19T19:22:46.377

6

This is because the do mechanism does not care (fortunately) about whether or not the innermost code actually refers to (some of) the loop variables.

See you always get 3*5=15 values regardless of innermost code:

 λ> 
 λ> xs1 = do { i <- [1..5] ; j <- [2..4] ; return i }
 λ> xs1
[1,1,1,2,2,2,3,3,3,4,4,4,5,5,5]
 λ> 
 λ> xs2 = do { i <- [1..5] ; j <- [2..4] ; return 9 }
 λ> xs2
[9,9,9,9,9,9,9,9,9,9,9,9,9,9,9]
 λ> 
 λ> xs3 = do { i <- [1..5] ; j <- [2..4] ; return (i,j) }
 λ> xs3
[(1,2),(1,3),(1,4),(2,2),(2,3),(2,4),(3,2),(3,3),(3,4),(4,2),(4,3),(4,4),(5,2),(5,3),(5,4)]
 λ> 
 λ> length xs1
15
 λ> length xs2
15
 λ> length xs3
15
 λ>

As far as I can tell, this is perfectly standard behavior, that Haskell shares with C, C++, Fortran, Python ...

A C++ equivalent example:

#include  <vector>
#include  <iostream>

int main()
{
    std::vector<int>  vi{1,2,3,4,5};
    std::vector<int>  vj{2,3,4};

    for (int i: vi)
        for (int j: vj)
            std::cout << i << ", ";

    std::cout << std::endl;

    return EXIT_SUCCESS;
}

C++ output:

$ ./a.out
1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 
$

edited Aug 19 '20 at 19:22

answered Aug 19 '20 at 12:54

jpmarinier

4,427
1
10
23

(I know "depends on" is the accepted jargon but it can actually be unclear what that means. using "refers to" instead could be clearer; also using `xs1 = do { i <- [1..5] ; j <- [2..4] ; return (i,0) }` and `xs1 = do { i <- [1..5] ; j <- [2..4] ; return (i,j) }` as the additional examples.) – Will Ness Aug 19 '20 at 15:02
Thanks for such a detailed explaination! – ICFSZ Aug 19 '20 at 18:48
@WillNess - Points taken, thanks. However I still prefer to keep the first example as is, because this is _verbatim_ from the OP's question text. – jpmarinier Aug 19 '20 at 19:25
yes, hence "additional". (I think (i,0) shows the structure of it better than just some random value.) – Will Ness Aug 19 '20 at 20:34

score 5 · Accepted Answer · answered Aug 19 '20 at 13:08

I've also tried the equavalent form in List-comprehension and the result is the same

Good idea. It so happens that for lists, do notation does exactly the same as list comprehensions. (In fact, there's a syntactic extension that allows you to use list-comprehension notation for any monad, like you can use do notation for any monad.)

So, you're asking why [a | a<-[0,1], b<-[2,3]] gives [0,0,1,1] instead of [0,1]. The way this looks surprising is if you think about list comprehensions as set comprehensions like you'd find in maths. But lists aren't sets, though Haskellers do often use lists as a makeshift stand-in for sets. If lists comprehensions acted as set comprehension, then

  [x | x <- [0,1,0]]

should also yield only [0,1] as its result (or at least, it should yield the same result as [x|x<-[0,1]] does).

In general this sort of weeding-out-duplicates requires equality checks, and if you want to make it efficient also either an ordering or a hashing method. Lists don't do any such thing, so if you want set-like behaviour you should use a set-implementing data structure. Set and HashSet are the most common.

Need explanation for basic do block syntax

2 Answers2