How do I translate LR(1) Parse into a Abstract syntax tree?

Question

I have coded a table driven LR(1) parser and it is working very well however I am having a bit of a disconnect on the stage of turing a parse into a syntax tree/abstract syntax tree. This is a project that I m very passionate about but I have really just hit a dead end here. Thank you for your help in advance.

Edit: Also my parser just uses a 2d array and an action object that tells it where to go next or if its a reduction where to go and how many items to pop. I noticed that many people use the visitor pattern. Im not sure how they know what type of node to make.

Here is the pushdown automata for context

 while (lexer.hasNext() || parseStack.size() > 0) {
        Action topOfStack = parseStack.peek();
        token = parseStack.size() > 0 ? lexer.nextToken() : new Token(TokenType.EOF, "EOF");
        topOfStack.setToken(token);

        int row = topOfStack.getTransitionIndex();
        int column = getTerminalIndex(token.getLexeme());

        column = token.getType() == TokenType.IDENTIFIER
                && !terminalsContain(token.getLexeme()) ? 0 : column;

        Action action = actionTable[row][column];

        if (action instanceof Accept) {
            System.out.println("valid parse!!!!!!");
        } else if (action instanceof Reduction) {
            Reduction reduction = (Reduction) action;

            popStack(parseStack, reduction.getNumberOfItemsToPop());
            column = reduction.getTransitionIndex();
            row = parseStack.peek().getTransitionIndex();

            parseStack.push(new Action(gotoTable[row][column]));
            lexer.backupTokenStream();
        } else if (action != null) {
            parseStack.push(actionTable[row][column]);
        } else {
            System.out.println("Parse error");
            System.out.println("On token: " + token.getLexeme());
            break;
        }

Chris Dodd · Accepted Answer · 2015-05-26T05:35:21.303

3

Each reduction in the LR parsing process corresponds to an internal node in the parse tree. The rule being reduced is the internal AST node, and the items popped off the stack correspond to the children of that internal node. The item pushed for the goto corresponds to the internal node, while those pushed by shift actions correspond to leaves (tokens) of the AST.

Putting all that together, you can easily build an AST by createing a new internal node every time you do a reduction and wiring everything together appropriately.

edited May 26 '15 at 05:35

answered May 25 '15 at 20:47

Chris Dodd

119,907
13
134
226

Thank you very much. I will try and go implement this right away. – Devin Wall May 25 '15 at 20:57
Just for clarification does each shift create a new leaf node? – Devin Wall May 26 '15 at 03:22
1

Yes, the leaves of the AST correspond to tokens, and each shift is a single token. – Chris Dodd May 26 '15 at 05:27
I have one more question. I added it to the initial question marked with "Edit:" Once again thank you for all your help. – Devin Wall May 26 '15 at 05:33
1

You know the type from the rule being reduced. – Chris Dodd May 26 '15 at 05:35
Ok so I do need to include the rules in the parser. I was just using objects in an array that had numbers to say where to go in the array and how many states to pop of the stack. And each rule gets its own node type, correct? Sorry for asking so many questions. Thank you for all your help. – Devin Wall May 26 '15 at 05:40

How do I translate LR(1) Parse into a Abstract syntax tree?

1 Answers1

Linked