I need to get the start and end index of each rule. That is, the start index is the character position of the first character of the first token belonging to the rule, and the end index is the character position of the last character of the last token belonging to the rule. With these numbers I can crop the text matched by a rule out of the input file precisely.
The straightforward way of doing this should be to use the $start and $stop tokens, i.e., $start.getStartIndex() and $stop.getStopIndex(). However, I have found that the $stop token is often null, even when used in the @after action.
According to The Definitive ANTLR 4 Reference, the $stop token is defined as: "The last nonhidden channel token to be matched by the rule. When referring to the current rule, this attribute is available only to the after and finally actions." This sounds as if such a token should exist (at least for any rule that matches at least one token). So it is strange that this token is null in many cases, even for rules whose last element is a simple token rather than a subrule. How can the stop token be null in this case?
Right now, I am using a workaround: I ask the input stream for its current index, move one token back, and use that token as the stop token. However, this seems hacky:
@after {
    int start = $start.getStartIndex();
    int stop = _input.get(_input.index()-1).getStopIndex();
    // do something with start and stop
}
The cleaner solution (if $stop were not null) should look like this:
@after {
    int start = $start.getStartIndex();
    int stop = $stop.getStopIndex();
}
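For reference, the cropping step itself could be sketched as below. This is only an illustration, not part of the grammar: the class name CropByTokenIndices and the method crop are hypothetical, and the sample indices are made up by hand rather than taken from a real parse. The one thing it relies on is that getStartIndex() and getStopIndex() return inclusive, 0-based character positions, so the end index has to be shifted by one before it is passed to substring. Returning an empty string when stop is smaller than start is an assumption about how a rule that matched no tokens should be handled.

```java
public class CropByTokenIndices {

    /**
     * Crops the text of a rule out of the input, given the inclusive
     * character index of its first and last character.
     * Hypothetical helper for illustration only.
     */
    public static String crop(String input, int start, int stop) {
        // Assumption: a rule that matched no tokens yields stop < start,
        // which is treated here as empty text.
        if (stop < start) {
            return "";
        }
        if (start < 0 || stop >= input.length()) {
            throw new IllegalArgumentException(
                "indices out of range: " + start + ".." + stop);
        }
        // String.substring takes an exclusive end index, while the stop
        // index is inclusive, hence the + 1.
        return input.substring(start, stop + 1);
    }

    public static void main(String[] args) {
        String input = "a = 1 + 2 ;";
        // Hand-picked indices standing in for $start.getStartIndex()
        // and $stop.getStopIndex() of a rule that matched "1 + 2".
        System.out.println(crop(input, 4, 8));
    }
}
```

Forgetting the + 1 silently drops the last character of every cropped rule, which is easy to miss when the last token is a single character such as a semicolon.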