5

I'm trying to parse standard simple types (in the sense of lambda calculus) using FParsec, but I've having difficulty going from a Lex/Yacc style to that used in FParsec, specifically with respect to recursive definitions.

Examples of the types I am trying to parse are:

  • o
  • o -> o
  • (o -> o -> o) -> o

And here is my attempt:


    type SType =
      | Atom
      | Arrow of SType * SType

    let ws = spaces
    let stype, styperef = createParserForwardedToRef()

    let atom = pchar 'o' .>> ws |>> (fun _ -> Atom)

    let arrow = pipe2 (stype .>> (pstring "->" .>> ws)) 
                      stype
                      (fun t1 t2 -> Arrow (t1,t2))

    let arr = parse {
                let! t1 = stype
                do!  ws
                let! _  = pstring "->"
                let! t2 = stype
                do!  ws
                return Arrow (t1,t2)
              }

    styperef := choice [ pchar '(' >>. stype .>> pchar ')';
                         arr;
                         atom ]

    let _ = run stype "o -> o"`

When I load this into the interactive the last line causes a stack overflow (ironically quite hard to search for these days). I can imagine why, given that there are recursive references, but I would have thought the one token lookahead would have prevented following the first (bracketed) choice in stype. I assume therefore that it must be choosing arr, which chooses stype, and so on. But how to prevent this loop?

I'm interested in comments on idiomatic use of the library as well as corrections to my attempted solution.

rneatherway
  • 553
  • 3
  • 11
  • This could be want you want: http://stackoverflow.com/questions/6186230/recursive-grammars-in-fparsec – dave jones Jul 20 '11 at 16:10
  • Thanks, I have read that question/reply, but I can't quite see how to carry over the answer to my problem. I will have another look though. – rneatherway Jul 20 '11 at 16:16

2 Answers2

4

When you work with FParsec you need to parse sequences with the help of the sequence combinators instead of left-recursion. In your case you could for example use the sepBy1 combinator:

open FParsec

type SType =
     | Atom
     | Arrow of SType * SType

let ws = spaces : Parser<unit, unit>
let str_ws s = pstring s >>. ws

let stype, stypeRef = createParserForwardedToRef()

let atom = str_ws "o" >>% Atom

let elem = atom <|> between (str_ws "(") (str_ws ")") stype

do stypeRef:= sepBy1 elem (str_ws "->") 
              |>> List.reduceBack (fun t1 t2 -> Arrow(t1, t2))

let _ = run stype "o -> o"
Stephan Tolksdorf
  • 3,062
  • 21
  • 28
  • I keep forgetting sepBy. Nice answer! – dave jones Jul 20 '11 at 21:50
  • This is great thanks, like the use of >>% as well. However, this doesn't capture the right-associativity of "->". I changed the definition of stypeRef to `chainr1 elem ((str_ws "->") >>% (fun t1 t2 -> Arrow (t1,t2)))`, although presumably you could use a right-associative version of `List.reduce` as well. – rneatherway Jul 21 '11 at 13:48
  • I've replaced `reduce` with `reduceBack` to parse the arrow as a right-associative operator. I find the straightforward solution using `sepBy` and `reduceBack` cleaner than the `chainr` implementation, especially because the reduction function is a constant. Since the arrow is a right-associative operator, you always have to build some kind of intermediate stack or list with the elements of the sequence, so using `chainr1` here also has no efficiency advantage. On the contrary, it should be a bit slower, as it also needs to keep record of the parsed reduction functions. – Stephan Tolksdorf Jul 21 '11 at 19:10
0

This runs, but is probably hacked together too much. The type Parser... stuff is from the FParsec docs to avoid compiler errors.

type SType = 
    | Atom 
    | Arrow of SType * SType

type UserState = unit
type Parser<'t> = Parser<'t, UserState>


let ws = spaces

let atom : Parser<_> = pchar 'o' .>> ws |>> (fun _ -> Atom)

let rec term =
    parse {
        // Force something to come before another term.  Either
        // an atom, or a term in parens.
        let! first = choice [atom;
                             (pstring "(" .>> ws) >>. term .>> 
                              (pstring ")" .>> ws)]

        // Check if this is an arrow.  If not, just return first.
        let! res = choice [((pstring "->") .>> ws) >>. term |>> (fun x ->
                               Arrow (first, x));
                           preturn first]
        return res
        }
dave jones
  • 356
  • 2
  • 6