This grammar for ANTLR4 should break a document up into two types of substring: wiki and nowiki.

grammar NoWikiText;

nowiki: '<nowiki>' ~'</nowiki>'* '</nowiki>';
wiki: ~'<nowiki>'+;
document: (wiki | nowiki)*;

Here's the input:

<nowiki>2</nowiki>4<nowiki></nowiki>

I get two matches for nowiki. But the text "4", which should match wiki, is ignored. Why?

EDIT:

This seems to work:

grammar NoWikiText;

P1: '<nowiki>';
P2: '</nowiki>';
NP: .;

nowiki: P1 NP* P2;
wiki: NP+;
document: (wiki | nowiki)*;
Tiiba

1 Answer

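In the grammar you posted, only two tokens are created: <nowiki> and </nowiki>. The negation operator works differently than you expect: inside a parser rule, ~'</nowiki>' means "match any single token other than </nowiki>" (so it would match the token <nowiki>), not "match any characters up to </nowiki>". For your input <nowiki>2</nowiki>4<nowiki></nowiki>, the characters 2 and 4 are therefore not recognized as valid tokens: the lexer reports them as errors and drops them, so the parser never sees them. That is also why the grammar in your edit works: the rule NP: .; turns every other character into a token of its own.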
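
To make the token-level view concrete, here is a rough Python sketch of what happens to the input. It is only a toy model of the lexing step, not ANTLR's actual implementation; the function name tokenize and the catch_all flag are made up for illustration. With only the two literal tokens defined, 2 and 4 end up as lexer errors; with a catch-all rule like NP: .; they become tokens.

```python
# Toy model of the lexing step (an assumption for illustration, not ANTLR code).
# The only tokens the original grammar defines are its two string literals.
LITERALS = ["<nowiki>", "</nowiki>"]


def tokenize(text, catch_all=False):
    """Split text into tokens; return (tokens, unrecognized characters)."""
    tokens, errors = [], []
    i = 0
    while i < len(text):
        for literal in LITERALS:
            if text.startswith(literal, i):
                tokens.append(literal)
                i += len(literal)
                break
        else:
            # No literal matches at this position.
            if catch_all:
                tokens.append(text[i])  # behaves like the rule NP: .;
            else:
                errors.append(text[i])  # character is reported and dropped
            i += 1
    return tokens, errors


source = "<nowiki>2</nowiki>4<nowiki></nowiki>"

# Original grammar: the parser only ever sees the four literal tokens.
print(tokenize(source))

# Grammar from the edit: every other character becomes an NP token.
print(tokenize(source, catch_all=True))
```

In the first call the parser receives just the nowiki delimiters, so there is nothing left for wiki to match. In the second call 4 arrives as a token, which wiki: NP+; can consume.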

Bart Kiers