In a grammar I would like to implement texts without string delimiting xxx
.
The idea is to define things like
a = xxx;
instead of
a ="xxx";
to simplify typewriting. Otherwise there should be variable definitions and other kind of stuff as well.
As a first approach I experimented with this grammar:
grammar SpaceNoSpace;
prog: stat+;
stat:
'somethingelse' ';'
| typed description* content
;
typed:
'something' '-'
| 'anotherthing' '-'
;
description:
'someSortOfDetails' COLON ID HASH
| 'otherSortOfDetails' COLON ID HASH
;
content:
contenttext ';'
;
contenttext:
(~';')*
;
COLON: ':' ;
HASH: '#';
SEMI: ';';
SPACE: ' ';
ID: [a-zA-Z][a-zA-z0-9]+;
WS : [ \t\n\r]+ -> channel(HIDDEN);
ANY_CHAR : . ;
This works fine for input files like this:
something-someSortOfDetails: aVariableName#
this is the content of this;
anotherthing-someSortOfDetails: aVariableName#
here spaces are accepted as much as you like;
somethingelse;
But modifying the last line to
somethingelse ;
leads to a syntax error:
line 7:15 extraneous input ' ' expecting ';'
This probably reveals that the lexer rule
WS : [ \t\n\r]+ -> channel(HIDDEN);
is not applied, (but the SPACE rule???).
Otherwise, if I delete the SPACE lexer-rule, the space in "somethingelse ;" is ignored (by lexer-rule WS), so that the parser rule stat : somethingelse as a consequence is detected correctly. But as a consequence of the deleted SPACE-rule the content text will be reduced to single in-between-spaces, so "this here" will be reduced to "this here".
This is not a big problem, but nevertheless it is an interesting question:
is it possible to implement context-sensitive WS or SPACE lexer rules:
within the content parser-rule any space should be preserved, in any other rule spaces should be ignored.
Is this possible to define such a context-sensitive lexer-rule behavior in ANTLR4?