0

I've recently taken to reading some literature on the compilation of functional languages and I would like some insight into how the parsing of the commonly-given EBNF grammar for, say, expressions is actually resolved.

For example, in a simple lambda calculus with constants:

expr ::=
   var 
 | literal
 | 'λ' var '.' expr
 | expr expr
 | '(' expr ')'

The case I'm concerned with is application; expr expr, which is both left and right recursive. How do parser implementors resolve this? For the purpose of the question, I'm mostly wondering about parser generators such as bison (I know that ANTLR seems to do some trickery to resolve this case).

Is it possible to factor this out? Is there a structured way to resolve this kind of ambiguity or is it purely heuristic? I gather that OCaml uses a parser-generator for its grammar but I've not been able to dissect it enough to understand what's going on.

Any insight would be much appreciated.

oldjohn1994
  • 359
  • 4
  • 12
  • Does this help: https://stackoverflow.com/questions/17933925/parsing-a-sequence-of-expressions-using-yacc/17935523#17935523? – rici Mar 14 '20 at 16:18
  • I've tried both of those, they just lead to enormous shift/reduce conflicts in the grammar. The whole `%prec ATOM` trick seems to only parse application correctly if it sees like "(f x)", what I'm trying to achieve is that: "f x y z" is parsed as (((f x) y) z). I'm not sure bison is capable enough to actually do this. – oldjohn1994 Mar 16 '20 at 06:55
  • Actually, it just turns out that I was getting false results by sitting looking at a program that just reads yyin = stdin and doesn't get to the "apply" rule until some other token is given. I've yet to test this at scale and see if it can handle the type of grammar I want but I appreciate your answer to that question. – oldjohn1994 Mar 16 '20 at 07:07
  • the precedence "trick", and the precedence declarations in the second grammar, are mostly about including haskell-style infix operators. You might not care about those, and if you don't just leave out the corresponding productions. I do actually agree that a precedence declaration for `ATOM` is a bit ugly, and I think that comes through in the six-year-old answer. As far as your question about curried syntax, yes yacc/bison is certainly capable of handling that. It's not complicatec. – rici Mar 16 '20 at 07:19
  • Grammars don't compose well. If you drop a grammar in the middle of another grammar which shares non-terminals, parse conflicts are likely. If one or both grammars rely on precedence in a non-trivial way, conflicts are even more likely. That's a drag, but it doesn't make CFGs unusable. – rici Mar 16 '20 at 07:22

0 Answers0