I am trying to parse a date using ANTLR4 with C# as a target. A valid date in my case must satisfy the following:

  • be in year / month / day format
  • year MUST have only 4 digits
  • month and day MUST have only 2 digits

I know that similar questions have already been asked here, but their solutions do not seem to work for me.

I have read somewhere that there is priority-like parsing, where top-most rules based on how the grammar file is written are evaluated first. Note that, apart from dates, my grammar should also be able to parse integers.

The grammar I have, which works but does not enforce the aforementioned rules, is the following:

/*
 *  Parser Rules
 */

dateFormat :  DECIMAL '/' DECIMAL '/' DECIMAL
   ;

/*
 *  Lexer Rules
 */

DECIMAL:            DEC_DIGIT+;

fragment DEC_DIGIT:    [0-9];

I tried to put something like

YEAR or year : DEC_DIGIT DEC_DIGIT DEC_DIGIT DEC_DIGIT;

in either the lexer or the parser rules, but it did not work.

Any ideas or suggestions?

Note: Please do not suggest regex alternatives or argue about whether I should use ANTLR or not.

Athafoud

1 Answer

I suppose this should work: use the date parser rule for dates and the integer parser rule for integers.

date
    : year=FOUR_DIGITS SLASH month=TWO_DIGITS SLASH day=TWO_DIGITS
    ;

integer
    : INTEGER
    | FOUR_DIGITS
    | TWO_DIGITS
    ;

FOUR_DIGITS: DIGIT DIGIT DIGIT DIGIT;
TWO_DIGITS:  DIGIT DIGIT;
INTEGER:     DIGIT+;
SLASH:       '/';

fragment     DIGIT: [0-9];
Ivan Kochurkin
  • I do not know if this is the case, but in my example the integer (aka DECIMAL) is defined as a lexer rule and not as a parser rule. – Athafoud Apr 26 '16 at 12:26
  • The lexer is not able to count characters. If we want to have a lexer DECIMAL rule, we should also have an integer parser rule for handling FOUR_DIGITS and TWO_DIGITS (they are used in the date parser rule). – Ivan Kochurkin Apr 27 '16 at 18:19
  • FYI, I just tested my assumption that `top-most rules based on how the grammar file is written are evaluated first`. If I change your grammar and put the `INTEGER` lexer rule before the `FOUR_DIGITS` rule, the grammar fails to parse the dates. – Athafoud May 10 '16 at 06:40
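
The ordering effect in the last comment matches how ANTLR4 lexers choose a token: the rule that matches the longest input wins, and when several rules match the same length, the rule written first in the grammar wins. The sketch below is a small Python simulation of that selection logic, not ANTLR itself. The names `tokenize`, `token_types`, `ANSWER_ORDER` and `SWAPPED_ORDER` are hypothetical and exist only for illustration, and the regular expressions are assumed equivalents of the lexer rules in the answer.

```python
import re

# Assumed regex equivalents of the answer's lexer rules, in grammar order.
ANSWER_ORDER = [
    ("FOUR_DIGITS", re.compile(r"[0-9]{4}")),
    ("TWO_DIGITS", re.compile(r"[0-9]{2}")),
    ("INTEGER", re.compile(r"[0-9]+")),
    ("SLASH", re.compile(r"/")),
]

# Same rules, but with INTEGER moved before FOUR_DIGITS (the case from the comment).
SWAPPED_ORDER = [ANSWER_ORDER[2], ANSWER_ORDER[0], ANSWER_ORDER[1], ANSWER_ORDER[3]]


def tokenize(text, rules):
    """Longest match wins; on equal length, the earliest rule in `rules` wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            match = pattern.match(text, pos)
            # Strict '>' means a later rule never displaces an earlier rule on a tie.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


def token_types(text, rules):
    """Return only the token type names, which is what the parser rules see."""
    return [name for name, _ in tokenize(text, rules)]


if __name__ == "__main__":
    print(token_types("2016/04/26", ANSWER_ORDER))
    print(token_types("2016/04/26", SWAPPED_ORDER))
    print(token_types("12345", ANSWER_ORDER))
```

With the answer's order, `2016/04/26` is tokenized as `FOUR_DIGITS SLASH TWO_DIGITS SLASH TWO_DIGITS`, which is what the `date` rule expects. With `INTEGER` first, every digit run becomes `INTEGER` because it ties on length and is listed earlier, so the `date` rule can no longer match. A five-digit input such as `12345` is still `INTEGER` in both orders, since the longest match is considered before rule order.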