
I'm using Megaparsec 5. Following this guide, I can get a backtracking user state by combining StateT and ParsecT (types not defined here should be obvious or irrelevant):

type MyParser a = StateT UserState (ParsecT Dec T.Text Identity) a

If I run a parser p :: MyParser a like this:

parsed = runParser (runStateT p initialUserState) "" input

The type of parsed is:

Either (ParseError Char Dec) (a, UserState)

This means that, in case of an error, the user state is lost.

Is there any way to have it in both cases?

EDIT: Could I perhaps, in case of error, use a custom error component instead of Dec (a feature introduced in 5.0) and encapsulate the user state in there?

cornuz
  • does it work if you try composing your monad stack differently? `ParsecT Dec T.Text (StateT UserState) a` – hao Sep 30 '16 at 20:47
  • In this case the state would be available at the end of parsing, but it would not be back-tracking with the parser. – cornuz Sep 30 '16 at 21:16
  • The type of `runStateT p initialUserState` is `Parser (a, UserState)` - conceptually this could just be the parser which unconditionally fails and never produces a value, so the 'state' doesn't exist - what could such a function produce in this case? Surely not `(UserState, Either (..) (a, UserState))`. – user2407038 Sep 30 '16 at 22:37
  • @user2407038 the state would exist in any case, why not? If the parser fails immediately, it would probably be the initial user state. But I agree that the function signature would look odd. – cornuz Oct 01 '16 at 08:52

3 Answers


You can use a custom error component combined with the observing function for this purpose (see this great post for more information):

{-# LANGUAGE RecordWildCards #-}

module Main where

import Text.Megaparsec
import qualified Data.Set as Set
import Control.Monad.State.Lazy

data MyState = MyState Int deriving (Ord, Eq, Show)
data MyErrorComponent = MyErrorComponent (Maybe MyState) deriving (Ord, Eq, Show)

instance ErrorComponent MyErrorComponent where
    representFail _ = MyErrorComponent Nothing
    representIndentation _ _ _ = MyErrorComponent Nothing

type Parser = StateT MyState (Parsec MyErrorComponent String)

trackState :: Parser a -> Parser a
trackState parser = do
    result <- observing parser -- run parser but don't fail right away
    case result of
        Right x -> return x -- if it succeeds we're done here
        Left ParseError {..} -> do
            -- The state read here is the one in effect when `parser` started:
            -- changes made inside the failed parser are rolled back by StateT.
            st <- get
            failure errorUnexpected errorExpected $
                if Set.null errorCustom then Set.singleton (MyErrorComponent $ Just st) else errorCustom

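In the snippet above, observing works a bit like a try/catch block: it catches a parse error, after which trackState reads the current state and adds it to the custom error component. Because the state is threaded by StateT on top of the parser, the state read at that point is the one in effect when the observed parser started; changes made inside the failed parser are rolled back. The custom error component is then returned when runParser returns a ParseError.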

Here's a demonstration of how this function could be used:

a = trackState $ do
    put (MyState 6)
    string "foo"

b = trackState $ do
    put (MyState 5)
    a

main = print (runParser (runStateT b (MyState 0)) "" "bar")

In practice you would probably want something more elaborate; for instance, I imagine you could also record the entire stack of states passed through as the error propagates up through the nested parsers.

DanielM

You could try sandwiching ParsecT between two States, like

type MyParser a = StateT UserState (ParsecT Dec T.Text (State UserState)) a

Then write special-purpose put and modify operations that, after changing the outer state, copy the entire state into the inner State monad using put.

That way, even if parsing fails, you'll have the last "state before failure" available from the inner State monad.
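The layering can be sketched without a parser library at all. In the minimal sketch below, ExceptT String stands in for ParsecT Dec T.Text so that it runs with only mtl and transformers; UserState as an Int and the names syncInner, putBoth, modifyBoth, runMyParser and failing are hypothetical, chosen for illustration only.

```haskell
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.State (State, StateT, get, modify, put, runState, runStateT)
import Control.Monad.Trans.Class (lift)

-- Hypothetical placeholder for the question's UserState.
type UserState = Int

-- The outer StateT is discarded when the middle layer fails; the inner State
-- survives. ExceptT String is only a stand-in for ParsecT Dec T.Text.
type MyParser = StateT UserState (ExceptT String (State UserState))

-- Copy the current outer state into the inner State monad.
syncInner :: MyParser ()
syncInner = get >>= lift . lift . put

-- Special-purpose replacement for put: update the outer state, then mirror it.
putBoth :: UserState -> MyParser ()
putBoth s = put s >> syncInner

-- Special-purpose replacement for modify: update the outer state, then mirror it.
modifyBoth :: (UserState -> UserState) -> MyParser ()
modifyBoth f = modify f >> syncInner

-- Run all three layers. The second component of the result is the inner state,
-- which is available whether or not the middle layer failed.
runMyParser :: MyParser a -> UserState -> (Either String (a, UserState), UserState)
runMyParser p s0 = runState (runExceptT (runStateT p s0)) s0

-- A computation that changes the state twice and then fails.
failing :: MyParser ()
failing = do
  putBoth 1
  modifyBoth (+ 41)
  throwError "parse error"
```

Here runMyParser failing 0 yields a Left for the outer result, while the inner state still holds the last value written before the failure. With a real parser in the middle, runExceptT would be replaced by runParserT, and the inner copy would likewise not be rolled back when the parser backtracks.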

danidiaz
  • This is precisely what I have been doing so far with Parsec. I was just hoping that Megaparsec could make things simpler. – cornuz Oct 01 '16 at 08:38

I hit a similar problem. I keep the user state in a StateT underneath ParsecT, so it does not backtrack with the parser:

type SubDefPos = Int
type SubDefName = String

data MyParserSt = MyParserSt {
   subDefs :: [(SubDefPos, SubDefName)]
} 

ParsecT Void String (StateT MyParserSt Identity) Expr

Every change to the user state is stored together with the value of getOffset at that point, so that an entry can be rejected later if the current position is less than the position recorded in the state.
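The bookkeeping itself can be sketched as two pure functions over the state. This is only an illustration under assumptions: recordSubDef and pruneSubDefs are hypothetical names, the derived Eq and Show instances are added for convenience, and in a real parser the position argument would come from getOffset as described above.

```haskell
type SubDefPos = Int
type SubDefName = String

data MyParserSt = MyParserSt {
   subDefs :: [(SubDefPos, SubDefName)]
} deriving (Eq, Show)

-- Record a definition together with the offset at which it was seen.
recordSubDef :: SubDefPos -> SubDefName -> MyParserSt -> MyParserSt
recordSubDef pos name st = st { subDefs = (pos, name) : subDefs st }

-- Keep only entries recorded at or before the current offset. Entries with a
-- greater offset were written by a branch the parser has since backtracked out of.
pruneSubDefs :: SubDefPos -> MyParserSt -> MyParserSt
pruneSubDefs here st = st { subDefs = filter ((<= here) . fst) (subDefs st) }
```

One limitation of this check: if the parser backtracks and a different branch then advances past the recorded offset, a stale entry will pass the comparison, so the offset alone does not identify the branch that wrote it.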

Daniil Iaitskov