Take an (almost) textbook example in which multiplication should have precedence over addition, but the multiplication alternative also ends with an optional token.
expr : expr '*' expr ('ALSO')?
     | expr '+' expr
     | INT
     ;
INT: [0-9]+;
WS : [ \t\r\n]+ -> skip ;
When I try the grammar on the input 3 * 4 + 2, I get an unexpected tree in which the addition is nested under the multiplication:
           expr:1
          /   |   \
     expr:1   *   expr:2
       |         /   |   \
       3    expr:1   +   expr:1
              |            |
              4            2
However, when I use 3 + 4 * 2, I get the tree I would expect:
           expr:1
          /   |   \
     expr:1   +   expr:2
       |         /   |   \
       3    expr:1   *   expr:1
              |            |
              4            2
Also, if I move the optional token to the second alternative, I get the expected tree every time.
expr : expr '*' expr
     | expr '+' expr ('ALSO')?
     | INT
     ;
I also tried this with the non-greedy operator ??, and with explicitly defined lexer tokens, so that oddities around ordering due to implicit tokens can be ruled out.
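For reference, a variant combining both of those changes might look roughly like the sketch below. The grammar name Expr and the token names MUL, ADD, and ALSO are placeholders chosen for illustration; they do not appear in the grammar above.

```antlr
// Sketch only: the grammar name and the token names MUL, ADD, ALSO are illustrative.
grammar Expr;

expr : expr MUL expr ALSO??   // non-greedy optional suffix on the '*' alternative
     | expr ADD expr
     | INT
     ;

// Explicit lexer tokens instead of implicit string literals in the parser rule.
MUL  : '*' ;
ADD  : '+' ;
ALSO : 'ALSO' ;
INT  : [0-9]+ ;
WS   : [ \t\r\n]+ -> skip ;
```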
What would explain this ordering?