
I'm defining a grammar for arithmetic expressions using the following syntax. It's a subset of a more complicated whole, but the problems only occurred when I extended the grammar to include logical operations.

When I try to generate code using ANTLRWorks, it takes a very long time to even start generating. I think the problem is in the rule for paren, as it includes a loop back to the start of expr. Any help in fixing this would be great.

Thanks in advance

The options used:

options {   
tokenVocab = MAliceLexer;
backtrack = true;
}

The code for the grammar is below:

type returns [ASTTypeNode n]
: NUMBER {$n = new IntegerTypeNode();}
| LETTER {$n = new CharTypeNode();}
| SENTENCE { $n = new StringTypeNode();}
;   

term  returns [ASTNode n]
 : IDENTIFIER {$n = new IdentifierNode($IDENTIFIER.text);}
| CHAR {$n = new LetterNode($CHAR.text.charAt(1));} 
| INTEGER {$n = new NumberNode(Integer.parseInt( $INTEGER.text ));}
| STRING { $n = new StringNode( $STRING.text ); } 
 ;

paren returns [ASTNode n]
:term { $n = $term.n; }
|  LPAR expr RPAR { $n = $expr.n; }
;

negation returns [ASTNode n]
:BITNEG (e = negation) {$n = new BitNotNode($e.n);}
| paren {$n = $paren.n;}
;

unary returns [ASTNode n]
:MINUS (u =unary) {$n = new NegativeNode($u.n);}
| negation {$n = $negation.n;} 
;

mult returns [ASTNode n]
 :  unary DIV (m = mult)  {$n = new DivideNode($unary.n, $m.n);}
 | unary MULT (m = mult) {$n = new MultiplyNode($unary.n, $m.n);}
 | unary MOD (m=mult)  {$n = new ModNode($unary.n, $m.n);} 
 | unary  {$n = $unary.n;}
 ;

binAS returns [ASTNode n]
 : mult PLUS (b=binAS)  {$n = new AdditionNode($mult.n, $b.n);}
 | mult MINUS (b=binAS)  {$n = new SubtractionNode($mult.n, $b.n);} 
 | mult  {$n = $mult.n;}
 ;

 comp returns [ASTNode n]
: binAS GREATEREQ ( e =comp)  {$n = new GreaterEqlNode($binAS.n, $e.n);}
|binAS GREATER ( e = comp )  {$n = new GreaterNode($binAS.n, $e.n);}
|binAS LESS ( e = comp )  {$n = new LessNode($binAS.n, $e.n);}
|binAS LESSEQ ( e = comp )  {$n = new LessEqNode($binAS.n, $e.n);}
|binAS {$n = $binAS.n;}
;

equality returns [ASTNode n]
: comp EQUAL ( e = equality)  {$n = new EqualNode($comp.n, $e.n);}
|comp NOTEQUAL ( e = equality )  {$n = new NotEqualNode($comp.n, $e.n);}
|comp { $n = $comp.n; }
;   

bitAnd returns [ASTNode n]
: equality BITAND (b=bitAnd) {$n = new BitAndNode($equality.n, $b.n);}
| equality {$n = $equality.n;} 
;

bitXOr returns [ASTNode n]
: bitAnd BITXOR (b = bitXOr) {$n = new BitXOrNode($bitAnd.n, $b.n);}
| bitAnd {$n = $bitAnd.n;}
 ;    

bitOr returns [ASTNode n]
: bitXOr BITOR (e =bitOr) {$n = new BitOrNode($bitXOr.n, $e.n);}
| bitXOr {$n = $bitXOr.n;} 
    ;   

logicalAnd returns [ASTNode n]
: bitOr LOGICALAND (e = logicalAnd){ $n = new LogicalAndNode( $bitOr.n, $e.n ); }
| bitOr { $n = $bitOr.n;  }
;       

expr returns [ASTNode n]
: logicalAnd LOGICALOR ( e = expr ) { $n = new LogicalOrNode( $logicalAnd.n, $e.n); }
| IDENTIFIER INC {$n = new IncrementNode(new IdentifierNode($IDENTIFIER.text));}
    | IDENTIFIER DEC {$n = new DecrementNode(new IdentifierNode($IDENTIFIER.text));} 
    | logicalAnd {$n = $logicalAnd.n;} 
;


CNevin561

1 Answer

The long generation time seems to be caused by a bug introduced in ANTLR 3.3 (and still present in later versions). ANTLR 3.2 produces the following warning and error when generating a parser from your grammar:

warning(205): Test.g:31:2: ANTLR could not analyze this decision in rule equality; often this is because of recursive rule references visible from the left edge of alternatives. ANTLR will re-analyze the decision with a fixed lookahead of k=1. Consider using "options {k=1;}" for that decision and possibly adding a syntactic predicate.
error(10): internal error: org.antlr.tool.Grammar.createLookaheadDFA(Grammar.java:1279): could not even do k=1 for decision 6; reason: timed out (>1000ms)

It looks to me like you've used an LR grammar as the basis for your ANTLR grammar. Consider starting over, this time with LL parsing in mind. Have a look at the following Q&A to see how to parse expressions using ANTLR: ANTLR: Is there a simple example?
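
As a rough sketch of what an LL-friendly version might look like, the two arithmetic levels from the question could be left-factored so that the shared operand is matched once and the operator is chosen inside a `( ... )*` loop. The rule, token and node names are reused from the question; the labels `a` and `b` are only illustrative. Note that the loop builds a left-associative tree, whereas the original right-recursive rules were right-associative, so this assumes left associativity is what is wanted.

```antlr
// Sketch only: each alternative no longer starts with the same rule,
// so the parser can decide on the operator token alone.
mult returns [ASTNode n]
 : a=unary {$n = $a.n;}
   ( DIV  b=unary {$n = new DivideNode($n, $b.n);}
   | MULT b=unary {$n = new MultiplyNode($n, $b.n);}
   | MOD  b=unary {$n = new ModNode($n, $b.n);}
   )*
 ;

binAS returns [ASTNode n]
 : a=mult {$n = $a.n;}
   ( PLUS  b=mult {$n = new AdditionNode($n, $b.n);}
   | MINUS b=mult {$n = new SubtractionNode($n, $b.n);}
   )*
 ;
```

The same pattern would presumably carry over to comp, equality, bitAnd, bitXOr, bitOr and logicalAnd.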

Also, I see you're using some tokens that look an awful lot like each other: LETTER, CHAR, SENTENCE and IDENTIFIER. You must realize that if all of them may start with, for example, a lower case letter, only one of the rules is matched (the one that matches most, or in case of a tie, the one defined first in the lexer grammar). The lexer does not produce tokens based on what the parser "asks" for, it creates tokens independently from the parser.
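
To illustrate the tie-breaking rule, here is a hypothetical lexer fragment. The actual MAliceLexer rules are not shown in the question, so the literals and the identifier pattern below are assumptions made purely for illustration.

```antlr
// Hypothetical lexer rules, NOT taken from the question's MAliceLexer.
// Keyword rules are listed before IDENTIFIER so that they win the tie
// when both rules match exactly the same text.
LETTER     : 'letter' ;
SENTENCE   : 'sentence' ;
IDENTIFIER : ('a'..'z' | 'A'..'Z') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')* ;
```

Under these assumed rules, the input `letter` is matched equally well by LETTER and IDENTIFIER, so the rule defined first (LETTER) should be chosen, while `letters` should become an IDENTIFIER because that rule matches more characters. The parser cannot ask for one or the other.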

Finally, for a simple expression parser, you really don't need predicates (and backtrack=true causes ANTLR to automatically insert predicates in front of all parser rules!).
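
A minimal sketch of the top of the grammar without backtracking, assuming the lower precedence rules have been left-factored into loops rather than repeating their first operand in every alternative. With that assumption, the INC and DEC alternatives should be distinguishable from a plain expression by looking one token past IDENTIFIER, so no predicates are needed; this has not been run against the full MAlice grammar.

```antlr
options {
  tokenVocab = MAliceLexer;
  // backtrack = true is intentionally left out in this sketch
}

expr returns [ASTNode n]
 : IDENTIFIER INC {$n = new IncrementNode(new IdentifierNode($IDENTIFIER.text));}
 | IDENTIFIER DEC {$n = new DecrementNode(new IdentifierNode($IDENTIFIER.text));}
 | a=logicalAnd {$n = $a.n;}
   ( LOGICALOR b=logicalAnd {$n = new LogicalOrNode($n, $b.n);} )*
 ;
```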

Bart Kiers