
I'm trying to understand the rules for do notation.

Here is some code that typechecks:

fff :: Maybe Int
fff = do
    _ <- pure $ Just 100
    (+10)
    <$> Just 50

Which is basically fff = (+10) <$> Just 50. I would assume the above code would not type check, because surely each line should be within the context of Maybe, which (+10) is not.

Why does the above typecheck? Here is a simpler example of the above:

fff :: Int -> Maybe Int
fff i = do
    (+10)
    <$> Just i

Why is the above considered valid syntax? Does that not 'desugar' to:

fff i = ((+10) >>= (\i -> fmap (Just i))) i

Which indeed gives a typecheck error in ghci.


Here is an example that does not typecheck, even though it follows the same indentation as above:

x :: Maybe Int
x = do
  _ <- Just 1
  undefined
  <$> Just 5

(Thanks to @cvlad from the FP slack chat for the above example)

Chris Stryczynski
  • FYI [`Do`-notation explained, in colors](https://stackoverflow.com/a/11326549/849891). – Will Ness Mar 07 '20 at 16:42
  • your questions are not about do notation, but about the inner workings of the GHC parser. so it failed to recognize that expression when you broke it into two lines, so what. don't do that without indenting the second line more than the first one! :) and you haven't included the error messages. I expect the errors are telling you about the sub-expressions involved and that way you can see how it read them. – Will Ness Mar 07 '20 at 16:49
  • @WillNess It's a good question. *Why* does the parser/typechecker accept one but not the other? – chepner Mar 07 '20 at 16:53
  • Clearly the first one conforms to *some* rule; if you don't know, you don't need to dismiss the question as not worth asking. – chepner Mar 07 '20 at 16:54
  • @Chris you've changed the question drastically. your interpretation of the do block as bind code is incorrect: `<$>` is *infix* binary operator. – Will Ness Mar 07 '20 at 17:00
  • `<$> Just i` on its own is ill formed expression. – Will Ness Mar 07 '20 at 17:04
  • Sorry have been staring at this code for a while so I'm a bit brain fried. But I think I've narrowed it down to what my question actually is. Essentially I was comparing two blocks of code and figuring out why one was type checking while the other was not. – Chris Stryczynski Mar 07 '20 at 17:04
  • So maybe a ghc bug? @WillNess – Chris Stryczynski Mar 07 '20 at 17:05
  • no, see my answer. But you don't give an example that fails, now. – Will Ness Mar 07 '20 at 17:07
  • it's hard to chase your changing question. :) next time please don't do that. the code changed, but the answer is the same. the do block *before* the `<$>` is `Maybe t` (because of `Just 1`), and we can't fmap *that*. – Will Ness Mar 07 '20 at 17:28
  • The *do notation* is an abstraction over an abstraction and i believe that it can easily be and in fact should be avoided. When dealing with monads use `>>=` and `>>` with proper indentation and it's as readable and as maintainable as supposedly imperative looking but sometimes deceiving *do notation*. Haskell code doesn't need to look imperative. [Do notation considered harmful](https://wiki.haskell.org/Do_notation_considered_harmful) – Redu Mar 07 '20 at 19:19
  • one more thing, in your first code snippet, `fff = do` // `_ <- pure $ Just 100` // `(+10)` // `<$> Just 50`, if you remove `pure $`, it will fail. because of `pure` the do block can be any monadic type, so to match the `(+10)` it becomes `((->) r) t`, so actually `fff = (\r -> let _ = const (Just 100) r in (+10) r) <$> Just 50` (which is indeed equivalent to `(+10) <$> Just 50`). "surely each line should be within the context of Maybe which (+10) is not" you missed the `pure` there, which makes the context polymorphic, `Just 100` is just a pure value there. – Will Ness Mar 08 '20 at 00:18

2 Answers

fff :: Int -> Maybe Int
fff i = do
    (+10)
    <$> Just i

Why is the above considered valid syntax?

Because it is parsed as

fff i = do {        -- do { A } is just
    (+10) }         --      A
    <$> Just i

which is equivalent to

fff i =
    (+10) 
    <$> Just i

because <$> Just i on its own is an invalid expression (so fff i = ((+10) >>= (\i -> fmap (Just i))) i is an incorrect translation), and that delimits the extent of the do block, as per the rule quoted in @chi's answer.

Indeed its type is inferred as

fff :: Num b => b -> Maybe b
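As a quick way to check this reading, here is a minimal sketch that puts the layout version next to the explicit-brace version and the version with no do block at all. The names fffLayout, fffBraces and fffPlain are made up for this illustration, and the signatures are fixed to Int for simplicity.

```haskell
-- Layout version from the question: the `<$>` line cannot start a new
-- statement, so the do block is closed just before it.
fffLayout :: Int -> Maybe Int
fffLayout i = do
    (+10)
    <$> Just i

-- The same parse with the braces written out.
fffBraces :: Int -> Maybe Int
fffBraces i = do { (+10) } <$> Just i

-- A single-statement do block is just that statement, so this is the same too.
fffPlain :: Int -> Maybe Int
fffPlain i = (+10) <$> Just i

main :: IO ()
main = print (fffLayout 5, fffBraces 5, fffPlain 5)
-- prints (Just 15,Just 15,Just 15)
```

All three definitions should behave identically, which is consistent with the do block ending before the <$>.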

Your second example works if you add a space before the <$> in the last line. Without the space, it is again parsed as

inputTest :: FormInput -> IO (Either [String] (Int, Int))
inputTest fi = do {
    allErrors' <- undefined :: IO [String]
    undefined }
    <$> ((liftM2 ) (,) <$> undefined <*> undefined) fi

because <$> ... on its own is an invalid expression. Indeed, when I add the explicit separators,

inputTest2 :: String -> IO (Either [String] (Int, Int))
inputTest2 fi = do {
    allErrors2 <- undefined :: IO [String] ;
    undefined  }
    <$> ((liftM2 ) (,) <$> undefined <*> undefined) fi

I get the exact same error message on TIO (had to use String instead of your type there).

Because the first statement is undefined :: IO [String], the whole do block has some type IO t, and we can't fmap that over anything.

Always add all the explicit separators (in addition to practicing good indentation style), to avoid this weird syntax brittleness.


Your new example is

x :: Maybe Int
x = do          -- {     this is
  _ <- Just 1   -- ;       how it is
  undefined     -- }         parsed
  <$> Just 5

The code changed, but the answer is the same. The do block before the <$> is Maybe t (because of Just 1), and we can't fmap that.

Again, indent the last line some more and it'll compile, because undefined <$> Just 5 will now be parsed as one expression.
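A small sketch of that fix, under one assumption: (+10) is substituted for the question's undefined so that the result can actually be printed. The names xIndented and xExplicit are hypothetical.

```haskell
-- (+10) stands in for the question's `undefined` so the result is observable.

-- The last line is indented deeper than the statements, so it continues
-- the statement above it: the final statement is `(+10) <$> Just 5`.
xIndented :: Maybe Int
xIndented = do
  _ <- Just 1
  (+10)
    <$> Just 5

-- The same block with explicit braces and separator, which leaves no
-- room for the layout rule to close the block early.
xExplicit :: Maybe Int
xExplicit = do { _ <- Just 1 ; (+10) <$> Just 5 }

main :: IO ()
main = print (xIndented, xExplicit)
-- prints (Just 15,Just 15)
```

With the <$> line at the same indentation as the statements, the block would instead close before it, leaving a Maybe value to the left of <$>, which is the type error from the question.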

Will Ness

This is a weird interaction.

I started to simplify the test case to this, which runs fine.

> x = do succ ; <$> Just 1
> x
Just 2

By comparison, this does NOT parse:

> y = do { succ ; <$> Just 1 }      
error: parse error

However, this parses:

> z = do { succ } <$> Just 1      
> z
Just 2

So, here's what I think is going on. Since the token <$> can never start an expression, the parse tentatively fails. The do parser rule is, essentially, a maximal munch rule: on failure, add an implicit } and try again.

Because of this, x above is parsed as z. Since succ is a monadic value (as is the (+10) line in the OP's question), it can appear inside do. This makes the type check succeed.
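To illustrate why a function can be a do statement at all: functions from a fixed argument type form a monad, so a do block whose statements are functions is itself a function. The sketch below takes the two-statement block from the question's first snippet on its own; the names inner and innerBind are invented for the example, and innerBind is only my reading of how the block desugars.

```haskell
-- The do block from the question's first snippet, without the `<$>` line.
-- In the function monad, `pure (Just 100)` is the constant function
-- \_ -> Just 100, and the block as a whole has type Int -> Int.
inner :: Int -> Int
inner = do
    _ <- pure (Just 100)
    (+10)

-- The same block written with an explicit bind.
innerBind :: Int -> Int
innerBind = pure (Just 100) >>= \_ -> (+10)

main :: IO ()
main = do
    print (inner 50)               -- 60
    print (inner <$> Just 50)      -- Just 60
    print (innerBind <$> Just 50)  -- Just 60
```

So the block before the <$> behaves like (+10), and mapping it over Just 50 gives the same result as (+10) <$> Just 50.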

Quoting the Haskell Report, section 2.7:

A close brace is also inserted whenever the syntactic category containing the layout list ends; that is, if an illegal lexeme is encountered at a point where a close brace would be legal, a close brace is inserted.
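The same rule is what allows a one-line let, where the binding list has to end before in. A minimal sketch, with made-up names (sumLet, sumBraces, viaDo), showing that case next to the do example from above:

```haskell
-- `in` cannot continue the list of bindings, so an implicit close brace
-- is inserted right before it.
sumLet :: Int
sumLet = let a = 1; b = 2 in a + b

-- The same expression with the braces written explicitly.
sumBraces :: Int
sumBraces = let { a = 1; b = 2 } in a + b

-- The do case from this answer: `<$>` cannot start a statement, so the
-- block is closed before it and this reads as `do { succ } <$> Just 1`.
viaDo :: Maybe Int
viaDo = do succ ; <$> Just 1

main :: IO ()
main = print (sumLet, sumBraces, viaDo)
-- prints (3,3,Just 2)
```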

chi