Documentation for the parsec package states that u
argument is used to carry some user state through monadic computation. But the same functionality can be achieved by basing ParsecT
monad transformer on State
monad. So if my parser is not stateful, i don't need u
altogether, but have to set it to ()
with parsec. What's rationale for adding non-optional state support to ParsecT
?
Asked
Active
Viewed 421 times
8

arrowd
- 33,231
- 8
- 79
- 110
-
What do you mean non-optional? How would it be if it were optional? – Shoe Jan 08 '15 at 08:16
-
It wouldn't be at all and the `ParsecT` type would have only 3 arguments - `s m a`. I meant that if user wants state, he can base on `State`. – arrowd Jan 08 '15 at 08:25
-
You can just use a type alias: `type YourParsec s = Parsec s ()`, no? – Shoe Jan 08 '15 at 08:44
-
4I suppose it's for optimisation purposes: they presumably found out that an explicit stack with `State` slows down parsing, whereas an unused `()` built-in state has almost no effect on performance. – leftaroundabout Jan 08 '15 at 08:52
-
1That’s an assumption (and a good one), but do we know for sure? – Joachim Breitner Jan 08 '15 at 12:22
-
My wild guess: originally, only the `Parsec s u` monad was designed, which included the state type `u`. Then, the transformer `ParsecT` was designed so to allow `Parsec s u a = ParsecT s u Identity a`. Now, the `u` was no longer needed, but the design stuck with it instead of having `Parsec s u a = ParsecT s (State u) a` and one fewer parameter to `ParsecT`. – chi Jan 08 '15 at 13:16
-
1You may want to look at http://stackoverflow.com/a/24981676/570682 - the user state is reset when trying alternatives (as with `(<|>)`) but an underlying `MonadState m` would not be. – Lambdageek Jan 08 '15 at 14:05
2 Answers
5
Because a parser of type ParsecT s () (State st) a
behaves differently from a parser of type Parsec s st Identity a
when it comes to backtracking:
- User state resets when parsec tries an alternative after a failing parse that consumes no input.
- But the underlying Monad
m
does not backtrack; all the effects that happened on the way to a final parse result are kept.
Consider the following example:
{-# LANGUAGE FlexibleContexts #-}
module Foo where
import Control.Applicative
import Control.Monad.State
import Text.Parsec.Prim hiding ((<|>), State(..))
import Text.Parsec.Error (ParseError)
tick :: MonadState Int m => ParsecT s Int m ()
tick = do
lift $ modify (+1)
modifyState (+1)
tickTock :: MonadState Int m => ParsecT s Int m ()
tickTock = (tick >> empty) <|> tick
-- | run a parser that has both user state and an underlying state monad.
--
-- Example:
-- >>> run tickTock
-- (Right 1,2)
run :: ParsecT String Int (State Int) () -> (Either ParseError Int, Int)
run m = runState (runParserT (m >> getState) initUserState "-" "") initStateState
where initUserState = 0
initStateState = 0
As you can see, the underlying state monad registered two ticks (from both alternatives that were tried), while the user state of the Parsec monad transformer only kept the successful one.

Lambdageek
- 12,465
- 1
- 25
- 33
1
ParsecT
carries it's own state already: parsing position and input: http://haddocks.fpcomplete.com/fp/7.8/20140916-162/parsec/Text-Parsec-Prim.html#t:State
So as leftaroundabout pointed out, it's probably due optimisation purposes.