I am writing a grammar for multi-line strings (text blocks) in Java. The delimiter for the start and end of a text block is a triple quote ("""). I can successfully parse and build the AST for text blocks and their content, except for one issue: the TEXT_BLOCK_START token is returned after the tokens from the second lexer. I am using this as a guide: flow diagram. According to the ANTLR2 documentation, the way I have implemented this should produce the following token stream:
TEXT_BLOCK_START
-> content from second lexer, etc...
-> TEXT_BLOCK_END
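To make the expected ordering concrete, here is a minimal plain-Java model of how I understand a stack-based selector to behave. It does not use ANTLR at all; the class name, the method name, and the TEXT_BLOCK_CONTENT and SEMI tokens are hypothetical stand-ins, and the two iterators only imitate the two lexers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class SelectorModel {
    /** Returns the tokens in the order a stack-based selector is expected to emit them. */
    static List<String> expectedOrder() {
        // Hypothetical token sources standing in for the main and secondary lexers.
        Iterator<String> main = List.of("TEXT_BLOCK_START", "SEMI").iterator();
        Iterator<String> second = List.of("TEXT_BLOCK_CONTENT", "TEXT_BLOCK_END").iterator();

        // The stream on top of the stack is the one currently being read.
        Deque<Iterator<String>> stack = new ArrayDeque<>();
        stack.push(main);

        List<String> out = new ArrayList<>();
        while (!stack.isEmpty()) {
            Iterator<String> current = stack.peek();
            if (!current.hasNext()) {
                stack.pop(); // this source is exhausted
                continue;
            }
            String token = current.next();
            out.add(token); // the token is emitted before any switch takes effect
            if (token.equals("TEXT_BLOCK_START")) {
                stack.push(second); // models the push action in the main lexer rule
            } else if (token.equals("TEXT_BLOCK_END")) {
                stack.pop(); // models the pop action in the secondary lexer rule
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expectedOrder());
    }
}
```

In this model the opening delimiter is always emitted first, because the switch to the second source only affects the next read. That is the ordering I expected from the real lexers.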
I have tried changing the order of the action and the delimiter, changing the order of the rules, and using select() instead of selector.push().
Here are the important parts of the main class:
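// Main lexer, reading from the source file.
final Lexer lexer = new Lexer(reader);
lexer.setCommentListener(contents);

// Second lexer shares the main lexer's input state, so it continues
// reading from the same position in the same input.
final Lexer secondLexer = new Lexer(lexer.getInputState());

// Both lexers create tokens that can carry hidden-token links.
lexer.setTokenObjectClass("antlr.CommonHiddenStreamToken");
secondLexer.setTokenObjectClass("antlr.CommonHiddenStreamToken");

// Only the main lexer is wrapped in the hidden token filter.
final TokenStreamHiddenTokenFilter filter = new TokenStreamHiddenTokenFilter(lexer);

// The selector switches between the filtered main lexer and the second lexer.
final TokenStreamSelector selector = new TokenStreamSelector();
lexer.selector = selector;
secondLexer.selector = selector;
selector.addInputStream(filter, "filter");
selector.addInputStream(secondLexer, "secondLexer");
selector.select(filter);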
The main lexer rule:
TEXT_BLOCK_START
: "\"\"\"" {selector.push("secondLexer");}
;
The secondary lexer rule:
TEXT_BLOCK_END
: "\"\"\"" {selector.pop();}
;
As stated above, everything parses as expected, except that the token stream looks like this:
content from second lexer, etc...
-> TEXT_BLOCK_END
-> TEXT_BLOCK_START
What am I missing here?