3

I'm trying to built C-- compiler using ANTLR 3.4.

Full set of the grammar listed here,

program         : (vardeclaration | fundeclaration)*                    ;
vardeclaration  : INT ID (OPENSQ NUM CLOSESQ)? SEMICOL  ;

fundeclaration  : typespecifier ID OPENP params CLOSEP compoundstmt     ;
typespecifier   : INT | VOID                                            ;
params          : VOID | paramlist                                      ;
paramlist       : param (COMMA param)*                                  ;
param           :  INT ID (OPENSQ CLOSESQ)?         ;

compoundstmt    : OPENCUR vardeclaration* statement* CLOSECUR           ;
statementlist   : statement*                                            ;

statement       : expressionstmt | compoundstmt | selectionstmt | iterationstmt | returnstmt;
expressionstmt  : (expression)? SEMICOL;
selectionstmt   : IF OPENP expression CLOSEP statement (options {greedy=true;}: ELSE statement)?;
iterationstmt   : WHILE OPENP expression CLOSEP statement;
returnstmt      : RETURN (expression)? SEMICOL;

expression      : (var EQUAL expression) | sampleexpression;
var             : ID ( OPENSQ expression CLOSESQ )? ;

sampleexpression: addexpr ( ( LOREQ | LESS | GRTR | GOREQ | EQUAL | NTEQL) addexpr)?;
addexpr         : mulexpr ( ( PLUS | MINUS ) mulexpr)*;
mulexpr         : factor  ( ( MULTI | DIV  ) factor )*; 

factor          : ( OPENP expression CLOSEP ) | var | call  | NUM;
call            : ID OPENP arglist? CLOSEP;
arglist         : expression ( COMMA expression)*;

Used lexer rules as following,

ELSE    : 'else'    ;
IF      : 'if'      ;
INT     : 'int'     ;
RETURN  : 'return'  ;
VOID    : 'void'    ;
WHILE   : 'while'   ;

PLUS    : '+' ;
MINUS   : '-' ;
MULTI   : '*' ;
DIV     : '/' ;

LESS    : '<'  ;
LOREQ   : '<=' ;
GRTR    : '>'  ;
GOREQ   : '>=' ;

EQUAL   : '==' ;
NTEQL   : '!=' ;
ASSIGN  : '='  ;

SEMICOL : ';' ;
COMMA   : ',' ;

OPENP   : '(' ;
CLOSEP  : ')' ;
OPENSQ  : '[' ;
CLOSESQ : ']' ;
OPENCUR : '{' ;
CLOSECUR: '}' ;

SCOMMENT: '/*' ;
ECOMMENT: '*/' ;


ID  : ('a'..'z' | 'A'..'Z')+/*(' ')*/ ;
NUM : ('0'..'9')+ ;
WS  : (' ' | '\t' | '\n' | '\r')+ {$channel = HIDDEN;};
COMMENT: '/*' .* '*/' {$channel = HIDDEN;};

But I try to save this it give me the error,

error(211): /CMinusMinus/src/CMinusMinus/CMinusMinus.g:33:13: [fatal] rule expression has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2.  Resolve by left-factoring or using syntactic predicates or using backtrack=true option.
 |---> expression       : (var EQUAL expression) | sampleexpression;

1 error

How can I resolve this problem?

Bart Kiers
  • 166,582
  • 36
  • 299
  • 288
DarRay
  • 2,550
  • 6
  • 27
  • 39

2 Answers2

2

As already mentioned: your grammar rule expression is ambiguous: both alternatives in that rule start, or can be, a var.

You need to "help" your parser a bit. If the parse can see a var followed by an EQUAL, it should choose alternative 1, else alternative 2. This can be done by using a syntactic predicate (the (var EQUAL)=> part in the rule below).

expression
 : (var EQUAL)=> var EQUAL expression
 |               sampleexpression
 ;

More about predicates in this Q&A: What is a 'semantic predicate' in ANTLR?

Community
  • 1
  • 1
Bart Kiers
  • 166,582
  • 36
  • 299
  • 288
1

The problem is this:

expression      : (var EQUAL expression) | sampleexpression;

where you either start with var or sampleexpression. But sampleexpression can be reduced to var as well by doing sampleexpression->addExpr->MultExpr->Factor->var

So there is no way to find a k-length predicate for the compiler.

You can as suggested by the error message set backtrack=true to see whether this solves your problem, but it might lead not to the AST - parsetrees you would expect and might also be slow on special input conditions. You could also try to refactor your grammar to avoid such recursions.

stryba
  • 1,979
  • 13
  • 19
  • `backtrack=true` removes given error. But the result seems to be wrong. What does backtrack=true actually means? – DarRay Mar 03 '12 at 11:24
  • 1
    That's what I meant with result might not be as expected since you have a ambiguous grammar. backtrack tries to match input tokens as far as possible and if it hits a dead end (speak no alternatives) it goes back to the last ambiguous alternative it found and tries again from there until all possible paths are tried. – stryba Mar 03 '12 at 11:58