
I'm working on a new ANTLR grammar, similar to Natty's, that should recognize date expressions, but I have a problem with skip rules. Specifically, I want to ignore useless "and"s in expressions, for example:

Call Sam, John and Adam and fix a meeting with Sarah about the finance on Monday and Friday.

The first two "and"s are useless. I wrote the rule below to fix this problem, but it didn't work. Why, and what should I do instead?

    NW : [~WeekDay];
    UselessAnd : AND NW -> skip;
Saeed Masoumi
  • I don't think you can compose non-fragment lexer rules like that. – Mephy Oct 21 '16 at 11:00
  • @Mephy So what should I do? I have to fix this, because expressions with useless "and"s are parsed incorrectly. Would it be possible to fix this problem with code blocks? – rozhin bayati Oct 21 '16 at 11:17
  • You can't negate a word in ANTLR's lexer, only single characters. `[~WeekDay]` matches one of the following characters: `~`, `W`, `e`, `k`, `D`, `a` or `y` (and `~[WeekDay]` matches any character except `W`, `e`, `k`, `D`, `a` and `y`). But ANTLR isn't well suited to parse natural languages. I suggest you do a search on "natural language processing". Note that Natty does not parse complete sentences as you posted. – Bart Kiers Oct 21 '16 at 11:28
  • @BartKiers I think I should have mentioned that WeekDay is a lexer rule like this: `WeekDay : "Monday"|"Tuesday"|...|"Sunday"` – rozhin bayati Oct 21 '16 at 11:41
  • The same goes for lexer rules: you cannot negate them. You can only negate single characters in a lexer rule. – Bart Kiers Oct 21 '16 at 11:50

1 Answer


"Useless AND" is a semantic concept.

Grammars are about syntax, and handle semantic issues poorly. Don't couple these together.

Suggestion: when you write a grammar for a language, make your parser accept the language as it is, warts and all. In your case, I suggest you "collect" the useless ANDs. That way you can get the grammar "right" more easily, and more transparently to the next coder who has to maintain your grammar.

Once you have the AST, it is pretty easy to ignore (semantically) useless things; if nothing else, you can post-process the AST and remove the useless AND nodes.
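As a rough illustration of that post-processing step, here is a minimal Python sketch. It does not use ANTLR at all: the `(type, text)` tuples, `tokenize`, and `drop_useless_ands` are hypothetical stand-ins for whatever nodes your generated parser produces. It also assumes the definition of "useless" from the question, namely an AND that is not immediately followed by a weekday.

```python
import re

# Hypothetical stand-in for the WeekDay lexer rule from the question.
WEEKDAYS = {"monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"}


def tokenize(text):
    """Split text into (type, text) tokens, keeping every AND."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        lower = word.lower()
        if lower == "and":
            tokens.append(("AND", word))
        elif lower in WEEKDAYS:
            tokens.append(("WEEKDAY", word))
        else:
            tokens.append(("WORD", word))
    return tokens


def drop_useless_ands(tokens):
    """Keep an AND only when the next token is a WEEKDAY (assumed rule)."""
    kept = []
    for i, (kind, text) in enumerate(tokens):
        if kind == "AND":
            next_kind = tokens[i + 1][0] if i + 1 < len(tokens) else None
            if next_kind != "WEEKDAY":
                continue  # useless AND: drop it after recognition
        kept.append((kind, text))
    return kept


sentence = ("Call Sam, John and Adam and fix a meeting with Sarah "
            "about the finance on Monday and Friday.")
filtered = drop_useless_ands(tokenize(sentence))
print(" ".join(text for _, text in filtered))
```

The point of the split is that recognition stays simple and complete (every "and" becomes a token), while the decision about which ones matter lives in ordinary code that is easy to change. With a real ANTLR parse tree, the same filter would probably sit in a listener or visitor rather than a list loop.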

Ira Baxter