I'm trying to write a context-sensitive lexer rule in ANTLR but can't get it to do what I need. The rule needs to match one of two alternatives depending on the characters found at the beginning of the rule. Below is a greatly simplified version of the problem.
This example grammar:
lexer grammar X;
options
{
    language = C;
}
RULE :
    SimpleIdent {ctx->someFunction($SimpleIdent);}
    (
        {ctx->test != true}?
        //Nothing
      | {ctx->test == true}?
        SLSpace+ OtherText
    )
    ;
fragment SimpleIdent : ('a'..'z' | 'A'..'Z' | '_')+;
fragment SLSpace : ' ';
fragment OtherText : (~'\n')* '\n';
I would expect the lexer to exit this rule if ctx->test is false, ignoring any characters after SimpleIdent. Unfortunately, ANTLR checks the character after SimpleIdent before it tests the predicate, so it always takes the second alternative if there is a space there. This is clearly shown in the generated C code:
// X.g:10:3: ({...}?|{...}? ( SLSpace )+ OtherText )
{
    int alt2=2;
    switch ( LA(1) )
    {
    case '\t':
    case ' ':
        {
            alt2=2;
        }
        break;
    default:
        alt2=1;
    }
    switch (alt2)
    {
    case 1:
        // X.g:11:5: {...}?
        {
            if ( !((ctx->test != true)) )
            {
                //Exception
            }
        }
        break;
    case 2:
        // X.g:13:5: {...}? ( SLSpace )+ OtherText
        {
            if ( !((ctx->test == true)) )
            {
                //Exception
            }
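To make the behavior concrete, here is a minimal standalone sketch of the decision logic as I read it from the generated excerpt. It is not ANTLR runtime code: LexerCtx, predict_alt and match_rule_tail are hypothetical names I made up for illustration, and only the test flag and the two-step structure (predict from LA(1), then validate the predicate) come from the code above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the lexer context; only `test` is taken
   from the grammar above. */
typedef struct {
    bool test;
} LexerCtx;

/* Mirrors the first generated switch: the alternative is picked from
   the next input character alone, without looking at ctx->test. */
int predict_alt(char la1)
{
    switch (la1) {
    case '\t':
    case ' ':
        return 2;
    default:
        return 1;
    }
}

/* Mirrors the second generated switch: the predicate only validates
   the alternative that was already chosen. Returns true if the rule
   would match, false where the generated code raises its exception. */
bool match_rule_tail(const LexerCtx *ctx, char la1)
{
    int alt = predict_alt(la1);
    if (alt == 1) {
        return !ctx->test;   /* {ctx->test != true}? */
    }
    return ctx->test;        /* {ctx->test == true}? */
}
```

With test set to false and a space as the next character, predict_alt returns 2 and match_rule_tail returns false, which is the failure I am seeing: the predicate never gets a chance to steer the lexer toward the empty alternative.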
How can I force ANTLR to take a specific path in the lexer at runtime?