0

I was wondering how to check that all paths have a return in a function during syntax analysis. So say I have the following in the Lexer

RETURN: 'return';
PRINT: 'print';
IF:'if';
ELSE: 'else';
THEN:'then';
PLUS: '+';
MINUS:'-';
EQUALS: '==';
DIGIT: '0'..'9';
OPEN:'{';
CLOSE:'}';
STRING: [a..zA..Z]+;
SEMICOLON: ';';

and parser

function: STRING OPEN statement CLOSE
statement: RETURN expr | PRINT expr | IF expr THEN statement ELSE statement | statement SEMICOLON statement;
expr: DIGIT| expr PLUS expr | expr MINUS expr | expr EQUALS expr;

My question is that a valid function should have a return statement and nothing after it. So a valid one is

test { return 2+2 }

or

test{ if 2 == 2 then return 2 else return 3 }

an invalid one would be where there is unreachable code after the return. For example.

test{return 2; print 3}

How would I go about checking that there is nothing after return statements?

my main java method looks something like this:

MyLexer mylexer = new MyLexer(new ANTLRInputStream(System.in));
CommonTokenStream toks = new CommonTokenStream(mylexer);
MyParser parser = new MyParser(tokens);
ParseTree parseTree = parser.program();
  • 1
    There's no way to enforce this via the grammar. You can check it in your semantic actions, but usually wouldn't check it in the parser at all, but in the semantic analysis phase. – sepp2k Oct 28 '17 at 22:46
  • Thank you very much – Anon ymous Oct 29 '17 at 11:26
  • @sepp2k: why can't you enforce that with a grammar? It's a simple syntactic feature. – rici Nov 01 '17 at 13:23
  • @rici No, you're right, I didn't think about it properly. It'd still lead to an annoying amount of duplication (different productions for blocks that appear at the end of a function than those that don't), so I'd still recommend just checking afterwards. – sepp2k Nov 01 '17 at 13:27

1 Answers1

0

I'm not an expert and had never use error reporting until today, but if you just want a list of errors, you could do the following, inspired by chapter 9 of The Definitive ANTLR 4 Reference.

File Question.g4 :

grammar Question;

/* Detecting invalid input after a return statement. */

question
@init {System.out.println("Question last update 1302");}
    :   function+ EOF
    ;

function
    :   STRING OPEN statement_block CLOSE
    ;

statement_block
    :   statement* if_last? return_statement
    ;

statement
    :   PRINT expr
    |   IF expr THEN statement ELSE statement
    |   statement SEMICOLON statement
    ;

if_last
    :   IF expr THEN statement_block ELSE statement*
    ;

return_statement
    :   RETURN expr
    ;

expr
    :   DIGIT
    |   expr PLUS expr
    |   expr MINUS expr
    |   expr EQUALS expr
    ;

CLOSE  : '}' ;
ELSE   : 'else' ;
EQUALS : '==' ;
IF     : 'if' ;
MINUS  : '-' ;
OPEN   : '{' ;
PLUS   : '+' ;
PRINT  : 'print' ;
RETURN : 'return' ;
THEN   : 'then' ;

DIGIT     : [0-9] ;
STRING    : [a-zA-Z]+ ;
SEMICOLON : ';' ;

WS  : [ \r\n\t] -> channel(HIDDEN) ;

File MyListener.java :

public class MyListener extends QuestionBaseListener {
    QuestionParser parser;
    public MyListener(QuestionParser parser) { this.parser = parser; }

    public void exitFunction(QuestionParser.FunctionContext ctx) {
        System.out.println(">>> in MyListener for function");
        System.out.println(parser.getTokenStream().getText(ctx));
    }
}

File test.java :

import org.antlr.v4.runtime.*;

import org.antlr.v4.runtime.tree.*;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.IOException;
import java.util.*;

public class test {
    public static class UnderlineListener extends BaseErrorListener {
        public void syntaxError(Recognizer<?, ?> recognizer,
                                Object offendingSymbol,
                                int line, int charPositionInLine,
                                String msg,
                                RecognitionException e)
        {
            System.err.println("line " + line + ":" + charPositionInLine + " " + msg);
            underlineError(recognizer,(Token)offendingSymbol,
                line, charPositionInLine);
        }

        protected void underlineError(Recognizer recognizer,
                                      Token offendingToken, int line,
                                      int charPositionInLine) {
            CommonTokenStream tokens =
                (CommonTokenStream)recognizer.getInputStream();
            String input = tokens.getTokenSource().getInputStream().toString();
            String[] lines = input.split("\n");
            String errorLine = lines[line - 1];
            System.err.println(errorLine);
            for (int i=0; i<charPositionInLine; i++) System.err.print(" ");
            int start = offendingToken.getStartIndex();
            int stop = offendingToken.getStopIndex();
            if ( start>=0 && stop>=0 ) {
                for (int i=start; i<=stop; i++) System.err.print("^");
            }
            System.err.println();
        }
    }

    public static void main(String[] args) throws IOException {
        ANTLRInputStream input = new ANTLRFileStream(args[0]);
        QuestionLexer lexer = new QuestionLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        QuestionParser parser = new QuestionParser(tokens);
        parser.removeErrorListeners(); // remove ConsoleErrorListener
        parser.addErrorListener(new UnderlineListener()); // add ours
        ParseTree tree = parser.question();
        System.out.println("---parsing ended");
        ParseTreeWalker walker = new ParseTreeWalker();
        MyListener my_listener = new MyListener(parser);
        System.out.println(">>>> about to walk");
        walker.walk(my_listener, tree);
    }
}

File t.text :

test { return 2+2 }
test { if 2 == 2 then return 2 else return 3 }
test { if 2 == 2 then print 2 else print 3 return 4 }
test {return 2; print 3}
test { if 2 == 2 then return 2; abc else return 3; def }
test { if 2 == 2 then if 6 then print 6 else print 3 else if 5 then
print 1; print 2 else print 3 return 4 }

Execution :

$ java test t.text 
Question last update 1302
line 4:14 mismatched input ';' expecting {'}', '==', '-', '+'}
test {return 2; print 3}
              ^
line 5:30 mismatched input ';' expecting {'else', '==', '-', '+'}
test { if 2 == 2 then return 2; abc else return 3; def }
                              ^
line 5:49 mismatched input ';' expecting {'}', '==', '-', '+'}
test { if 2 == 2 then return 2; abc else return 3; def }
                                                 ^
---parsing ended
>>>> about to walk
>>> in MyListener for function
test { return 2+2 }
>>> in MyListener for function
test { if 2 == 2 then return 2 else return 3 }
>>> in MyListener for function
test { if 2 == 2 then print 2 else print 3 return 4 }
>>> in MyListener for function
test {return 2; print 3}
>>> in MyListener for function
test { if 2 == 2 then return 2; abc else return 3; def }
>>> in MyListener for function
test { if 2 == 2 then if 6 then print 6 else print 3 else if 5 then print 1; print 2 else print 3 return 4 }
BernardK
  • 3,674
  • 2
  • 15
  • 10
  • This requires all if-blocks to have a return statement at the end, even if the the `if` statement isn't the last statement in the function. It also doesn't produce an error if both cases of the `if` have a return statement and there's a statement after the `if`. To actually solve this in the grammar you need different rules for ifs that appear at the end of a function and ones that don't. You also need to take into account that `if`s, which aren't at the end, may have a return in one of the cases, but not both (at least that's the sensible rule and how it works in other languages). – sepp2k Nov 03 '17 at 03:20
  • @sepp2k Right, updated for one case. I leave the OP provide all the cases if he is interested by this solution. But as you suggested in your first comment, it would be better to di it in the semantic analysis phase. – BernardK Nov 03 '17 at 12:12