I've recently been tasked with writing an ANTLR3 grammar for a fictional language. Most of it works, but there are two minor issues I could use some help with:
1) Comments are delimited by '/*' and '*/', and may not be nested. I know how to implement comments themselves ('/*' .* '*/'), but how would I go about disallowing their nesting?
2) String literals are defined as any sequence of characters (except for double quotes and new lines) in between a pair of double quotes. They can only be used in an output statement. I attempted to define this thus:
output : OUTPUT (STRINGLIT | IDENT) ;
STRINGLIT : '"' ~('\r' | '\n' | '"')* '"' ;
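To make my expectation concrete, here is a rough sketch of what I think the STRINGLIT rule should mean, written as a plain Java regex rather than ANTLR. The class and method names (StringLitCheck, isStringLit) are made up for illustration only; this is not the generated lexer, just my reading of the rule.

```java
import java.util.regex.Pattern;

public class StringLitCheck {
    // Intended meaning of STRINGLIT : '"' ~('\r' | '\n' | '"')* '"' ;
    // an opening quote, any characters except CR, LF or a quote, a closing quote.
    private static final Pattern STRINGLIT = Pattern.compile("\"[^\"\\r\\n]*\"");

    // Hypothetical helper: true only if the whole text is one string literal.
    public static boolean isStringLit(String text) {
        return STRINGLIT.matcher(text).matches();
    }

    public static void main(String[] args) {
        // Single-line literal: should be accepted.
        System.out.println(isStringLit("\"Hello, World!\""));   // prints true
        // Literal containing a line break: should be rejected.
        System.out.println(isStringLit("\"Hello,\nWorld!\""));  // prints false
    }
}
```

In other words, I would expect a literal containing a line break to be rejected outright.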
For some reason, however, the parser accepts
OUTPUT "Hello,
World!"
and tokenises it as "Hello, \nWorld. Where the exclamation mark and the closing " went, I have no idea. Could it have something to do with my whitespace rule?
WHITESPACE : ( '\t' | ' ' | '\n' | '\r' | '\f' )+ { $channel = HIDDEN; } ;
Any advice would be much appreciated - thanks for your time! :)