I am trying to use a class derived from the generated BaseVisitor class to construct an AST from the parse tree of the grammar I am using for a simple compiler.
Consider a subset of my grammar, where stat is a statement of a lightweight language:
...
program: stat;
stat: SkipStat # Skip
| type Ident AssignEq assignRHS # Declare
| assignLHS AssignEq assignRHS # Assign
...
My understanding (as per this post) is to have the visitor call visit(ctx->stat()), where ctx has type ProgramContext*. The derived visitor then correctly dispatches to the corresponding overridden visitSkip(..), visitDeclare(..), etc.
I have simple node classes for my AST; again, a subset looks as follows:
#include <memory>
#include <string>
#include <utility>

// AssignLHS, AssignRHS and Type are other node classes, omitted from this subset.
struct BaseNode {};
struct Stat : BaseNode {};  // declared before Program, which refers to it
struct Program : BaseNode {
    Program(std::shared_ptr<Stat> body) : body(std::move(body)) {}
    std::shared_ptr<Stat> body;
};
struct Assign : Stat {
    Assign(std::shared_ptr<AssignLHS> lhs, std::shared_ptr<AssignRHS> rhs) :
        lhs(std::move(lhs)),
        rhs(std::move(rhs)) {}
    std::shared_ptr<AssignLHS> lhs;
    std::shared_ptr<AssignRHS> rhs;
};
struct Declare : Stat {
    Declare(std::shared_ptr<Type> type, std::string name, std::shared_ptr<AssignRHS> rhs) :
        type(std::move(type)),
        name(std::move(name)),
        rhs(std::move(rhs)) {}
    std::shared_ptr<Type> type;
    std::string name;
    std::shared_ptr<AssignRHS> rhs;
};
struct Skip : Stat {};
Tying the two points together, I am trying to have the mentioned visitSkip(..), visitDeclare(..), etc. (which all return std::any) return std::shared_ptr<Skip>, std::shared_ptr<Declare>, etc., such that visitProgram(..) can receive them from a call to visit in the form
std::shared_ptr<Stat> stat = std::any_cast<std::shared_ptr<Stat>>(visit(ctx->stat()));
However (!), std::any_cast only succeeds when the requested type exactly matches the stored type; it does not convert a stored std::shared_ptr<Skip> to std::shared_ptr<Stat>, so this approach does not work. Instead, I have started to create my own visitor entirely separate from (i.e. not a child of) the generated visitor.
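To make the failure concrete, here is a minimal sketch with no ANTLR dependency. The functions visitSkipDerived and visitSkipAsStat are hypothetical stand-ins for an overridden visitSkip(..), and holdsStatPtr is a helper invented for illustration. The virtual destructor on BaseNode is also an assumption, added only to make the hierarchy polymorphic. The second stand-in shows one possible workaround: converting to std::shared_ptr<Stat> before the value enters std::any.

```cpp
#include <any>
#include <memory>

struct BaseNode { virtual ~BaseNode() = default; };  // assumption: polymorphic base
struct Stat : BaseNode {};
struct Skip : Stat {};

// Hypothetical stand-in for visitSkip(..): the any stores std::shared_ptr<Skip>.
std::any visitSkipDerived() {
    return std::make_shared<Skip>();
}

// Same node, but converted to the base pointer type first,
// so the any stores std::shared_ptr<Stat>.
std::any visitSkipAsStat() {
    return std::shared_ptr<Stat>(std::make_shared<Skip>());
}

// True only if the any holds exactly std::shared_ptr<Stat>.
// The pointer form of std::any_cast returns nullptr on a type mismatch
// instead of throwing std::bad_any_cast.
bool holdsStatPtr(const std::any& result) {
    return std::any_cast<std::shared_ptr<Stat>>(&result) != nullptr;
}
```

With these definitions, holdsStatPtr(visitSkipDerived()) is false and holdsStatPtr(visitSkipAsStat()) is true. The cost of the workaround is that every visit function for a statement alternative has to remember to convert to the base pointer type before returning.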
I assume there is a better solution using the generated classes that I am missing.
Having found this answer to a not dissimilar post, is it even worth constructing an AST? If my idea of how to use antlr4 is inaccurate, please let me know and point me towards a good source I can start from. Thanks.
Edit: As per chapter 7 of The Definitive ANTLR 4 Reference, I believe I could achieve what I want by using a stack holding BaseNode* and casting the popped nodes appropriately. This does not seem like the best solution. Ideally, I would like something similar to the Java implementation, where the expected return type is passed to the visitor classes as a type parameter.
Edit 2: I have now implemented such a solution (using a listener instead of a visitor). Here is an example from the exitAssign(..) function:
void Listener::exitAssign(Parser::AssignContext* ctx) {
    // Operands come off the stack in reverse order: rhs was pushed last.
    const auto rhs = std::static_pointer_cast<AssignRHS>(m_stack.top());
    m_stack.pop();
    const auto lhs = std::static_pointer_cast<AssignLHS>(m_stack.top());
    m_stack.pop();
    m_stack.push(std::make_shared<Assign>(lhs, rhs));
}
I still feel this solution is not the best. It feels hacky, since the arguments must be popped in reverse order, and it is easy to forget to push onto the stack after creating the AST node.
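One way to soften the first problem is a checked, typed pop, sketched below under several assumptions. NodeStack and reduceAssign are hypothetical names, not part of ANTLR or of my real listener. AssignLHS and AssignRHS are assumed to derive from BaseNode, and BaseNode is given a virtual destructor so that std::dynamic_pointer_cast can be used. A wrong pop order then fails loudly instead of silently producing a mis-typed pointer.

```cpp
#include <cstddef>
#include <memory>
#include <stack>
#include <stdexcept>
#include <utility>

struct BaseNode { virtual ~BaseNode() = default; };  // assumption: polymorphic base
struct Stat : BaseNode {};
struct AssignLHS : BaseNode {};  // assumption: derives from BaseNode
struct AssignRHS : BaseNode {};  // assumption: derives from BaseNode
struct Assign : Stat {
    Assign(std::shared_ptr<AssignLHS> lhs, std::shared_ptr<AssignRHS> rhs)
        : lhs(std::move(lhs)), rhs(std::move(rhs)) {}
    std::shared_ptr<AssignLHS> lhs;
    std::shared_ptr<AssignRHS> rhs;
};

// Hypothetical wrapper around the listener's m_stack.
class NodeStack {
public:
    void push(std::shared_ptr<BaseNode> node) { m_stack.push(std::move(node)); }

    // Pops the top node, throwing if the stack is empty or the node
    // is not of the expected dynamic type.
    template <typename T>
    std::shared_ptr<T> pop() {
        if (m_stack.empty()) throw std::logic_error("pop from empty node stack");
        auto node = std::dynamic_pointer_cast<T>(m_stack.top());
        if (!node) throw std::logic_error("unexpected node type on stack");
        m_stack.pop();
        return node;
    }

    std::size_t size() const { return m_stack.size(); }

private:
    std::stack<std::shared_ptr<BaseNode>> m_stack;
};

// What the body of exitAssign(..) could look like with the helper.
void reduceAssign(NodeStack& stack) {
    auto rhs = stack.pop<AssignRHS>();  // pushed last, so popped first
    auto lhs = stack.pop<AssignLHS>();
    stack.push(std::make_shared<Assign>(std::move(lhs), std::move(rhs)));
}
```

This does not remove the need to pop in reverse order or to remember the final push; it only turns an ordering mistake into an exception at the point where it happens.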
I will use this implementation for now, but again, if people who use ANTLR 4 in C++ prefer a better method, please do let me know.