
Here is my .g4 file:

grammar Hello;

start : compilation;
compilation : sql*;
sql : altercommand;
altercommand : ALTER TABLE SEMICOLON;
ALTER: 'alter';
TABLE: 'table';
SEMICOLON : ';';

My main class:

import java.io.IOException;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;

public class Main {
    public static void main(String[] args) throws IOException {
        // The first command is deliberately misspelled; the second one is valid.
        ANTLRInputStream ip = new ANTLRInputStream("altasdere table ; alter table ;");
        HelloLexer lex = new HelloLexer(ip);
        CommonTokenStream token = new CommonTokenStream(lex);
        HelloParser parser = new HelloParser(token);

        parser.setErrorHandler(new CustomeErrorHandler());

        System.out.println(parser.start().toStringTree(parser));
    }
}

My CustomeErrorHandler class:

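import org.antlr.v4.runtime.DefaultErrorStrategy;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.misc.IntervalSet;

public class CustomeErrorHandler extends DefaultErrorStrategy {

    @Override
    public void recover(Parser recognizer, RecognitionException e) {
        super.recover(recognizer, e);
        TokenStream tokenStream = (TokenStream) recognizer.getInputStream();

        // If recovery stopped at a semicolon, skip it and resynchronize.
        if (tokenStream.LA(1) == HelloParser.SEMICOLON) {
            IntervalSet intervalSet = getErrorRecoverySet(recognizer);
            tokenStream.consume();
            consumeUntil(recognizer, intervalSet);
        }
    }
}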

When I give the input altasdere table ; alter table ; it won't parse the second command, because it has found an error in the first one. The output of my main class is:

line 1:0 token recognition error at: 'alta'
line 1:4 token recognition error at: 's'
line 1:5 token recognition error at: 'd'
line 1:6 token recognition error at: 'e'
line 1:7 token recognition error at: 'r'
line 1:8 token recognition error at: 'e'
line 1:9 token recognition error at: ' '
(start compilation)
ggorlen
Adib Rajiwate

1 Answer


Section 9.5, "Altering ANTLR's Error Handling Strategy", of The Definitive ANTLR 4 Reference says:

The default error handling mechanism works very well, but there are a few atypical situations in which we might want to alter it.

Is your grammar so atypical that you need to handle token recognition errors yourself? Personally, I would write a grammar that produces no errors at the lexer level, like the following.

File Question.g4:

grammar Question;

question
@init {System.out.println("Question last update 0712");}
    :   sql+ EOF
    ;

sql
    :   alter_command
    |   erroneous_command
    ;

alter_command
    :   ALTER TABLE SEMICOLON
        {System.out.println("Alter command found : " + $text);}
    ;

erroneous_command
    :   WORD TABLE? SEMICOLON
        {System.out.println("Erroneous command found : " + $text);}
    ;

ALTER     : 'alter' ;
TABLE     : 'table' ;
WORD      : [a-z]+ ;
SEMICOLON : ';' ;
WS        : [ \t\r\n]+ -> channel(HIDDEN) ;

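Note that the WORD rule must come after ALTER and TABLE: when several lexer rules match the same input with the same length, ANTLR picks the rule that is defined first, so a WORD rule placed first would swallow the keywords.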
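To illustrate why the rule order matters, here is a small plain-Java simulation of the selection policy (longest match wins; on a tie, the rule defined first wins). It is only a sketch built on java.util.regex, not ANTLR's actual implementation, and the class name LexerRuleOrderDemo and its methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: simulates lexer rule selection, it does not use ANTLR.
class LexerRuleOrderDemo {

    // Builds the rule table in definition order (LinkedHashMap keeps that order).
    // wordFirst = true puts WORD before the keywords, which is the wrong order.
    static Map<String, Pattern> rules(boolean wordFirst) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        if (wordFirst) rules.put("WORD", Pattern.compile("[a-z]+"));
        rules.put("ALTER", Pattern.compile("alter"));
        rules.put("TABLE", Pattern.compile("table"));
        if (!wordFirst) rules.put("WORD", Pattern.compile("[a-z]+"));
        rules.put("SEMICOLON", Pattern.compile(";"));
        rules.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
        return rules;
    }

    // At each position, keep the longest match; the strict '>' comparison
    // means an earlier rule is never replaced by a later rule of equal length.
    static List<String> tokenize(String input, Map<String, Pattern> rules) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            if (!bestRule.equals("WS")) { // whitespace is dropped in this sketch
                tokens.add(bestRule + ":" + input.substring(pos, bestEnd));
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "altasdere table ; alter table ;";
        System.out.println(tokenize(input, rules(false))); // keywords defined first
        System.out.println(tokenize(input, rules(true)));  // WORD defined first
    }
}
```

With the keywords defined first, alter and table keep their own token types and only altasdere falls through to WORD. With WORD defined first, every lowercase word ties with or beats the keyword rules, so no ALTER or TABLE token is ever produced.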

File t.text:

altasdere table ; alter table ;

Execution:

$ grun Question question -tokens -diagnostics t.text
[@0,0:8='altasdere',<WORD>,1:0]
[@1,9:9=' ',<WS>,channel=1,1:9]
[@2,10:14='table',<'table'>,1:10]
[@3,15:15=' ',<WS>,channel=1,1:15]
[@4,16:16=';',<';'>,1:16]
[@5,17:17=' ',<WS>,channel=1,1:17]
[@6,18:22='alter',<'alter'>,1:18]
[@7,23:23=' ',<WS>,channel=1,1:23]
[@8,24:28='table',<'table'>,1:24]
[@9,29:29=' ',<WS>,channel=1,1:29]
[@10,30:30=';',<';'>,1:30]
[@11,31:31='\n',<WS>,channel=1,1:31]
[@12,32:31='<EOF>',<EOF>,2:0]
Question last update 0712
Erroneous command found : altasdere table ;
Alter command found : alter table ;

As you can see, the erroneous input has been absorbed by the WORD token. Now it should be easy to process or ignore the erroneous command in the listener/visitor.

BernardK
    Sometimes it's worth not letting the lexer handle certain cases, as it would produce error messages that don't mean much to the end user. Take escape sequences as an example: instead of rejecting invalid ones at the lexer level, it is better to check them in a semantic phase, which allows messages that are much more to the point. – Mike Lischke Oct 13 '17 at 08:19