0

I've been trial-and-erroring to figure out from an intuitive level when a rule in antlr is left-recursive of not. For example, this (Removing left recursion) is left-recursive in theory, but works in Antlr:

// Example input: x+y+1

grammar DBParser;

expression
    : expression '+' term
    | term;

term
    : term '*' atom
    | atom;

atom
    : NUMBER
    | IDENTIFIER
    ;

NUMBER              : [0-9]+        ;
IDENTIFIER          : [a-zA-Z]+     ;

enter image description here

So what makes a rule left-recursive and a problem in antlr4, and what would be the simplest example of showing that (in an actual program)? I'm trying to practice remove left-recursive productions, but I can't even figure out how to intentionally add a left-recursive rule that antlr4 can't resolve!

samuelbrody1249
  • 4,379
  • 1
  • 15
  • 58

2 Answers2

1

Antlr4 can handle direct left-recursion as long as it is not hidden:

  • Hidden means that the recursion is not at the beginning of the right-hand side but it might still be at the beginning of the expansion because all previous symbols are nullable.

  • Indirect means that the first symbol on the right-hand side of a production for N eventually derives a sequence starting with N. Antlr describes that as "mutual recursion".

Here are some SO questions I found by searching for [antlr] left recursive:

ANTLR4 - Mutually left-recursive grammar

ANTLR4 mutually left-recursive error when parsing

ANTLR4 left-recursive error

ANTLR Grammar Mutually Left Recursive

Mutually left-recursive lexer rules on ANTL4?

Mutually left recursive with simple calculator

There are lots more, including one you asked. I'm sure you can mine some examples.

rici
  • 234,347
  • 28
  • 237
  • 341
  • does `epsilon` just mean 'empty' when doing `A' → αA' | ε` ? – samuelbrody1249 Aug 19 '22 at 23:51
  • 1
    Yes, ε means nothing. It would be more accurate to write that as `A' → αA' |`, but many people prefer nothing to be visible; hence the ε. Nothing is a very hard concept; making nothing visible tends to make you think that it's something, but nothing is not something. It's nothing. – rici Aug 19 '22 at 23:59
0

As mentioned by rici above, one of the items that Antlr4 does not support is indirect left-recursion. Here would be an example:

grammar DBParser;
expr: binop | Atom;
binop: expr '+' expr;
Atom: [a-z]+ | [0-9]+ ;

error(119): DBParser.g4::: The following sets of rules are mutually left-recursive [expr, binop]

Notice that Antlr4 can support the following direct left-recursion though:

grammar DBParser;
expr: expr '+' expr | Atom;
Atom: [a-z]+ | [0-9]+ ;

However, even if you add in parentheticals (for whatever reason?) it doesn't work:

grammar DBParser;
expr: (expr '+' expr) | Atom;
Atom: [a-z]+ | [0-9]+ ;

Or:

grammar DBParser;
expr: (expr '+' expr | Atom);
Atom: [a-z]+ | [0-9]+ ;

Both will raise:

error(119): DBParser.g4::: The following sets of rules are mutually left-recursive [expr]
samuelbrody1249
  • 4,379
  • 1
  • 15
  • 58
  • 1
    A parenthetical expression in the right-hand side of an Antlr rule is internally converted to a new non-terminal whose name doesn't appear anywhere in the grammar. So it effectively introduces indirectness (through the new non-terminal). – rici Aug 20 '22 at 00:01
  • I could add that to the answer, if you think it would help. – rici Aug 20 '22 at 00:52
  • @rici sure, or just copy into a new answer and I can delete mine -- whichever you prefer. – samuelbrody1249 Aug 20 '22 at 01:15