Let's say that inside your rule a, all ACCEPTED_SYMBOLS chars are valid, but inside rule b the '=' is not valid.
You could do this using a predicate like this:
a
: ACCEPTED_SYMBOLS
;
b
: t=ACCEPTED_SYMBOLS {!$t.text.equals("=")}?
;
ACCEPTED_SYMBOLS
: '~' | '!' | '@' | '#' | '$' | '%' | '^' | '-' | '+' | '=' |
'\\' | ':' | '"' | '\'' | '<' | '>' | ',' | '.' | '?' | '/'
;
Note that only single quotes and backslashes need to be escaped inside a string literal in an ANTLR grammar.
Or, without a predicate:
a
: any
;
b
: SYMBOLS
;
any
: SYMBOLS
| EQ
;
SYMBOLS
: '~' | '!' | '@' | '#' | '$' | '%' | '^' | '-' | '+' |
'\\' | ':' | '"' | '\'' | '<' | '>' | ',' | '.' | '?' | '/'
;
EQ
: '='
;
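To see why the predicate-free version works, here is a rough Python sketch of the idea. It is a toy model, not ANTLR or its generated code, and the helper names (token_type, rule_a_accepts, rule_b_accepts) are made up for illustration. The point is that every character belongs to exactly one token type, so the parser rules only have to choose which token types they allow.

```python
# Toy model, NOT ANTLR: disjoint token types plus a choice made in the parser rules.
SYMBOLS = set("~!@#$%^-+\\:\"'<>,.?/")  # every accepted symbol except '='


def token_type(ch):
    """Lexer step: each character maps to exactly one token type."""
    if ch == "=":
        return "EQ"
    if ch in SYMBOLS:
        return "SYMBOLS"
    raise ValueError(f"no lexer rule matches {ch!r}")


def rule_a_accepts(ch):
    """Models parser rule a, which goes through any: SYMBOLS | EQ."""
    return token_type(ch) in {"SYMBOLS", "EQ"}


def rule_b_accepts(ch):
    """Models parser rule b: only SYMBOLS, so '=' is rejected."""
    return token_type(ch) == "SYMBOLS"


print(rule_a_accepts("="), rule_b_accepts("="))
print(rule_a_accepts("+"), rule_b_accepts("+"))
```

Because the two lexer rules never overlap, no character is ambiguous and no predicate is needed.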
EDIT
Note that you cannot define the rules in the following order:
ACCEPTED_SYMBOLS: ('~' |'!' |'@' |'#' |'$' |'%' |'^' |'-' |'+' | '=' |
'\\'|':' |'"'|'\''|'<' |'>' |',' |'.' |'?' | '/' ) ;
ACCEPTED_SYMBOLS_EXCEPT_EQUAL: ('~' |'!' |'@' |'#' |'$' |'%' |'^' |'-' |'+' |
'\\'|':' |'"'|'\''|'<' |'>' |',' |'.' |'?' | '/' ) ;
ANTLR will report an error saying that the token ACCEPTED_SYMBOLS_EXCEPT_EQUAL can never be created, since prior rule(s) will already match everything ACCEPTED_SYMBOLS_EXCEPT_EQUAL can match.
And if you switched the rules:
ACCEPTED_SYMBOLS_EXCEPT_EQUAL: ('~' |'!' |'@' |'#' |'$' |'%' |'^' |'-' |'+' |
'\\'|':' |'"'|'\''|'<' |'>' |',' |'.' |'?' | '/' ) ;
ACCEPTED_SYMBOLS: ('~' |'!' |'@' |'#' |'$' |'%' |'^' |'-' |'+' | '=' |
'\\'|':' |'"'|'\''|'<' |'>' |',' |'.' |'?' | '/' ) ;
then the rule ACCEPTED_SYMBOLS can only ever match a '='. All other characters will be tokenized as ACCEPTED_SYMBOLS_EXCEPT_EQUAL tokens.
Keep in mind that the lexer operates independently of the parser: it simply creates tokens by going through the lexer rules from top to bottom, trying to match as much as possible, and it does not care what the parser is trying to match at that time.
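The effect of rule order can be illustrated with a small Python sketch. This is only a toy model of "first matching rule wins" for single-character rules, not ANTLR's actual implementation, and the tokenize function and its rule lists are hypothetical names chosen for the example. Unlike ANTLR, the toy model does not report the unreachable rule as an error; it just never produces that token.

```python
# Toy model of a lexer, NOT ANTLR itself: it only illustrates rule ordering.
SYMBOLS_EXCEPT_EQUAL = set("~!@#$%^-+\\:\"'<>,.?/")
ALL_SYMBOLS = SYMBOLS_EXCEPT_EQUAL | {"="}


def tokenize(text, rules):
    """Tokenize text with an ordered list of (token_name, char_set) rules.

    Each rule matches exactly one character, like the lexer rules above.
    When several rules match the same character, the one listed first wins.
    """
    tokens = []
    for ch in text:
        for name, chars in rules:
            if ch in chars:
                tokens.append((name, ch))
                break
        else:
            raise ValueError(f"no lexer rule matches {ch!r}")
    return tokens


# Broad rule first: the narrower rule can never produce a token.
broad_first = [
    ("ACCEPTED_SYMBOLS", ALL_SYMBOLS),
    ("ACCEPTED_SYMBOLS_EXCEPT_EQUAL", SYMBOLS_EXCEPT_EQUAL),
]
# Narrow rule first: ACCEPTED_SYMBOLS is only ever produced for '='.
narrow_first = list(reversed(broad_first))

print(tokenize("+=", broad_first))
print(tokenize("+=", narrow_first))
```

In both orderings the token type of a character is fixed before the parser sees it, which is why neither ordering lets rule a and rule b see different tokens for the same input.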