I'm currently working on a project that requires me to generate an ANTLR grammar on the fly, because the generated language depends on user input. Hence I generate the ANTLR grammar in code and generate a lexer and parser from it.

My goal is to have an input program that is written in the language of the generated grammar (it is actually created through genetic algorithms, but that's not relevant here), and to ultimately have an AST representing the program. However, currently I'm only able to generate a ParseTree object, and this is not sufficient for my program.

Does anybody know how to use the ANTLR API to generate an object representing the AST? (For example an antlr.collections.AST object). I'll append a piece of code here, but the best way to test it is to run the Eclipse project that resides in https://snowdrop.googlecode.com/svn/trunk/src/ANTLRTest/

public class GEQuorra extends GEModel {

    Grammar grammar;
    private org.antlr.tool.Grammar lexer;
    private org.antlr.tool.Grammar parser;
    private String startRule;
    private String ignoreTokens;

    public GEQuorra(IntegrationTest.Grammar g) {
        grammar = new Grammar(g.getBnfGrammar());

        setGrammar(grammar);

        try {
            ignoreTokens = "WS";
            startRule = "agentProgram";
            parser = new org.antlr.tool.Grammar(g.getAntlrGrammar());

            @SuppressWarnings("rawtypes")
            List leftRecursiveRules = parser.checkAllRulesForLeftRecursion();
            if (leftRecursiveRules.size() > 0) {
                throw new Exception("Grammar is left recursive");
            }

            String lexerGrammarText = parser.getLexerGrammar();
            lexer = new org.antlr.tool.Grammar();
            lexer.importTokenVocabulary(parser);
            lexer.setFileName(parser.getFileName());
            lexer.setGrammarContent(lexerGrammarText);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public double getFitness(CandidateProgram program) {
        try {

            GECandidateProgram gecp = (GECandidateProgram) program;
            System.out.println("Parsing:" + gecp.getSourceCode());

            CharStream input = new ANTLRStringStream(gecp.getSourceCode());
            Interpreter lexEngine = new Interpreter(lexer, input);
            FilteringTokenStream tokens = new FilteringTokenStream(lexEngine);
            StringTokenizer tk = new StringTokenizer(ignoreTokens, " ");
            while (tk.hasMoreTokens()) {
                String tokenName = tk.nextToken();
                tokens.setTokenTypeChannel(lexer.getTokenType(tokenName), 99);
            }

            Interpreter parseEngine = new Interpreter(parser, tokens);
            ParseTree t;
            t = parseEngine.parse(startRule);

            return 1.0 / t.toStringTree().length();
        } catch (Exception e) {
            // Something failed, return very big fitness, making it unfavorable
            return Double.MAX_VALUE;
        }
    }
}

Here, t.toStringTree() returns the string representation of the ParseTree.
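To make the gap between the two tree shapes concrete, here is a minimal sketch of collapsing a parse tree into something AST-like by hand. It does not use the ANTLR API at all: `Node`, `ParseTreeToAst`, `simplify`, and the rule names `statement`, `call`, `expr`, and `atom` are hypothetical stand-ins (only `agentProgram` comes from the code above), and the set of tokens treated as punctuation is an assumption. In ANTLR 3 an AST is normally produced by a generated parser with `output=AST` plus rewrite rules or tree operators, so treat this only as an illustration of what such a tree leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a tree node; this is NOT an ANTLR class.
final class Node {
    final String label;                        // rule name or token text
    final List<Node> children = new ArrayList<>();

    Node(String label, Node... kids) {
        this.label = label;
        children.addAll(Arrays.asList(kids));
    }

    boolean isLeaf() {
        return children.isEmpty();
    }

    // LISP-style rendering, similar in spirit to toStringTree().
    String toStringTree() {
        if (isLeaf()) {
            return label;
        }
        StringBuilder sb = new StringBuilder("(").append(label);
        for (Node child : children) {
            sb.append(' ').append(child.toStringTree());
        }
        return sb.append(')').toString();
    }
}

final class ParseTreeToAst {
    // Assumed set of tokens that carry no meaning once the structure is known.
    private static final Set<String> PUNCTUATION =
            new HashSet<>(Arrays.asList("(", ")", ";"));

    // Drops punctuation leaves and collapses rule nodes that wrap a single child.
    static Node simplify(Node node) {
        if (node.isLeaf()) {
            return node;
        }
        List<Node> kept = new ArrayList<>();
        for (Node child : node.children) {
            if (child.isLeaf() && PUNCTUATION.contains(child.label)) {
                continue;                      // skip syntactic noise
            }
            kept.add(simplify(child));
        }
        if (kept.size() == 1) {
            return kept.get(0);                // collapse a single-child chain
        }
        Node result = new Node(node.label);
        result.children.addAll(kept);
        return result;
    }

    // Hand-built parse tree for the hypothetical input "move ( 3 ) ;".
    static Node example() {
        return new Node("agentProgram",
                new Node("statement",
                        new Node("call",
                                new Node("move"),
                                new Node("("),
                                new Node("expr", new Node("atom", new Node("3"))),
                                new Node(")"),
                                new Node(";"))));
    }

    public static void main(String[] args) {
        Node parseTree = example();
        System.out.println(parseTree.toStringTree());
        System.out.println(simplify(parseTree).toStringTree());
    }
}
```

The parse tree keeps every rule invocation and every token, while the simplified tree keeps only the operation and its operand. A real conversion would need per-rule decisions about which node becomes the root, which is what ANTLR's rewrite rules express in the grammar itself.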

sjorsvb
  • "to ultimately have an AST representing the program", as far as I understand AST isn't used to present a program. It is used to represent the parsed data tree. Also, the AST library is designed for internal usage. It's not part of the API. – gigadot Nov 02 '11 at 17:45
  • @BartKiers If you know a way to output a CommonTree it would help a lot. To explain what the user input is, you should know that I'm building this language to automatically generate behavior for networks of sensors/robots/etc. Hence, the "input" is to specify what kind of individuals a network consists of (for example: are they able to move? What kind of radios do they have, short range, long range?) This kind of input decides what the individual is capable of, and hence what can be used in the generated program. – sjorsvb Nov 03 '11 at 08:34
  • @BartKiers Also, the "functions" an individual has should be expendable, but because I'm using Grammatical Evolution (a technique in Genetic Programming) these functions should be a part of the grammar. Is this answer sufficient? – sjorsvb Nov 03 '11 at 08:37
  • @BartKiers Hi Bart, maybe it's more clear if you look at the template file for generating the ANTLR file: http://code.google.com/p/snowdrop/source/browse/trunk/src/ANTLRTest/templates/antlr.template, in which you can see that (among others) the static properties, and dynamic properties that an individual can have are A B or C, also, the number of different values for integers and floats are restricted by only putting in a few in the generated grammar (which has to do with restricting the search space for the genetic algorithm), so that is also a user provided parameter. – sjorsvb Nov 03 '11 at 12:22
  • @BartKiers Hi Bart, thanks for your input. In my original message I tried to hint at the very specific situation I'm working in, without getting off the track of the actual question: Does anybody know how to use the ANTLR API to generate an object representing the AST? To be clear, it's not about how to adjust the grammar, but more about what objects to call to get from a .g file to an actual object like a CommonTree or something similar. I think getting into detail any further isn't going to help solve this question, but thank you for your time. Cheers! – sjorsvb Nov 03 '11 at 14:58
  • @sjorsvb, it's just that I often see people asking how to implement a certain solution _they_ think is the proper way to go, while there is a far more straight forward way to tackle said problem. But, fair enough, I'll drop it :) – Bart Kiers Nov 03 '11 at 15:06
  • See: http://stackoverflow.com/questions/4931346/how-to-output-the-ast-built-using-antlr for how to create an AST as opposed to a simple parse tree (especially the line `CommonTree tree = (CommonTree)parser.parse().getTree();` in the `Main` class). If that _is_ what you're looking for, you can remove this question, since it's a duplicate in that case. HTH. – Bart Kiers Nov 03 '11 at 15:07

0 Answers