3

I have defined my language's BNF and have no idea how to design an AST from it.

For example, from the first several lines of my BNF:

<program> ::= <import declarations>? <class declaration>?
<import declarations> ::= <import declaration> | <import declarations> <import declaration>
<class declaration> ::= class <identifier> <class body>
<import declaration> ::= import <type name> ';'

How can I express this from my AST? Should I design it like this way?

typedef vector<ImportDeclaration*> ImportDeclarationList;

class Program {
    ImportDeclarationList importDeclarations;
    ClassDeclaration classDeclaration;
};

class ImportDeclaration {
    TypeName typeName;
};

class ClassDeclaration {
    Identifier identifer;
    ClassBody classbody;
};

Do I need to add some inheritance among thess classes?

Are there some books about how to design AST from BNF?

BenMorel
  • 34,448
  • 50
  • 182
  • 322
47dev47null
  • 272
  • 6
  • 13
  • Nodes need to contain an "abstract" syntax category (something typically close to the LHS of many rules). See http://stackoverflow.com/a/1916687/120163 for a discussion of AST vs CSTs, and why you might want to have nodes contain a concrete syntax category (e.g., exactly the LHS of rules, or the concrete names of leaves). – Ira Baxter Jan 29 '16 at 14:53

1 Answers1

3

You just have to implement a tree data structure. This means that you will need some class, say, Node, from which all other possible elements of the AST must inherit. You can then use member pointers (Node*). If the amount of children may vary, you can store those in a std::vector. eg. For a really simple production (assuming IntLiteral is a terminal):

Addition := IntLiteral + IntLiteral

It is possible to write the following code for an AST:

struct Node {
    virtual Node* clone() { return new Node(*this);};
    virtual ~Node() {}
};
class IntLiteral : Node {
    int value;
public:
    IntLiteral(int v) : value(v) {}
    virtual IntLiteral* clone()
    {
        return new IntLiteral(*this);
    }
};
class Addition : Node {
    Node* left;
    Node* right;
public:
    Addition(Node* l, Node* r)
        : Node(), left(l->clone()), right(r->clone()) {}
    virtual Addition* clone()
    {
        return new Addition(*this);
    }
};

Assuming that you're hand-implementing, you can add nodes to the root node (in this case of type Addition*) in the accepting function of your parser code.

In reality, you probably want to have a generate function for every Node. Alternatively, and probably this is a better idea, you will need accessors and a tree traverser to generate code.

There are quite a number of books about parsers, one of those would be the classic "Dragon Book" although the focus is actually on compilers there.

Aleph
  • 1,209
  • 10
  • 19