I'm using ANTLRv3 to parse input that looks like this:
* this is an outline item at level 1
** item at level 2
*** item at level 3
* another item at level 1
* an item with *bold* text
Stars at the beginning of a line mark the start of an outline item. Stars can also be part of an item's text (e.g. *bold*).
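To make the intended behaviour concrete, here is a rough sketch in plain Python (not ANTLR) of how I expect a single line to be split. The names MARKER and split_outline_item are made up for illustration, and the sketch only handles one line at a time, whereas ITEM_TEXT in my grammar may also contain line breaks:

```python
import re

# Hypothetical illustration only, not part of the ANTLR grammar.
# Marker: one or more stars starting at column 0, followed by one space or tab.
MARKER = re.compile(r"^(\*+)[ \t]")

def split_outline_item(line):
    """Return (level, text) for an outline item line, or None if there is no marker."""
    m = MARKER.match(line)
    if m is None:
        return None
    level = len(m.group(1))   # number of leading stars gives the outline level
    text = line[m.end():]     # the rest is item text; stars in it are plain text
    return level, text
```

With this, a line such as "* an item with *bold* text" should give level 1 and keep the stars around "bold" as part of the text. That is the result I want from the lexer.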
This is the grammar to parse outline items without support for stars in the item text:
outline_item: OUTLINE_ITEM_MARKER ITEM_TEXT;
OUTLINE_ITEM_MARKER: STAR_IN_COLUMN_ZERO STAR* (' '|'\t');
ITEM_TEXT: ('a'..'z'|'A'..'Z'|'0'..'9'|'\r'|'\n'|' '|'\t')+;
fragment STAR_IN_COLUMN_ZERO: {getCharPositionInLine()==0}? '*';
fragment STAR: {getCharPositionInLine()>0}? '*';
For the input *** foo bar
ANTLR produces the following parse tree:
So far this works as expected. Now I'm trying to add the star to the characters allowed in the item text, so I changed the lexer rule for ITEM_TEXT to the following:
ITEM_TEXT: ('a'..'z'|'A'..'Z'|'0'..'9'|'\r'|'\n'|' '|'\t'|STAR)+;
Now for the same input the following parse tree is produced:
This is the output in ANTLRWorks:
input.txt line 1:0 rule STAR failed predicate: {getCharPositionInLine()>0}?
input.txt line 1:1 missing OUTLINE_ITEM_MARKER at '** foo bar'
It seems that OUTLINE_ITEM_MARKER didn't match due to a MissingTokenException. What's wrong with the grammar, and what do I need to change to allow stars to be part of ITEM_TEXT?