
I have a grammar for parsing SQL code from several dialects.

Problem:

- Sometimes I want it to handle nested comments (e.g., Microsoft SQL):

COMMENT: '/*' (COMMENT|.)*? ('*/' | EOF) -> channel(HIDDEN);

- Sometimes I do not want it to handle them (e.g., Oracle):

COMMENT: '/*' .*? '*/' -> channel(HIDDEN);

I don't want to:

  • make two different grammars

  • compile my grammar twice to get two different lexers and two different parsers.

The best solution would be an argument, passed to the lexer/parser, that selects which "COMMENT implementation" to use.

Can I do this? If so, how? If not, is there a satisfactory alternative?

Thanks!

Kronos

1 Answer


You can achieve this with a semantic predicate in your lexer. You will need to: (1) Split the lexer and parser into separate grammars. (2) Create a base class for the lexer, referenced from the lexer grammar through the `superClass` option, with a boolean field, property, or method that you set to true if the lexer should allow nested comments and to false if it should not. For the code below, assume you add "bool nested = false;" to the lexer base class. (3) Within your lexer grammar, create a single COMMENT rule as shown below. (4) After creating your lexer, set the "nested" field to true if you want nested comments to be recognized.

COMMENT
   : (
      {nested}? '/*' (COMMENT|.)*? ('*/' | EOF)
      | '/*' .*? '*/') -> channel(HIDDEN)
   ;
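To make concrete what the flag switches between, here is a minimal hand-written sketch in plain Java. It is not ANTLR-generated code and does not use the ANTLR runtime; `CommentScanner` and `commentEnd` are hypothetical names chosen for illustration. It approximates the nested alternative by depth counting and assumes, like the rule above, that an unterminated nested comment runs to the end of the input.

```java
public class CommentScanner {
    /**
     * Returns the index just past the comment starting at `start`,
     * or -1 if no comment is recognized there.
     * `nested` plays the role of the boolean field in the lexer base class.
     */
    public static int commentEnd(String text, int start, boolean nested) {
        if (!text.startsWith("/*", start)) {
            return -1;
        }
        int i = start + 2;
        if (!nested) {
            // Flat alternative: '/*' .*? '*/' stops at the first closing delimiter.
            int close = text.indexOf("*/", i);
            return close < 0 ? -1 : close + 2;
        }
        // Nested alternative: each inner '/*' must be closed by its own '*/'.
        int depth = 1;
        while (i < text.length()) {
            if (text.startsWith("/*", i)) {
                depth++;
                i += 2;
            } else if (text.startsWith("*/", i)) {
                depth--;
                i += 2;
                if (depth == 0) {
                    return i;
                }
            } else {
                i++;
            }
        }
        // Unterminated comment: consume up to end of input (the EOF branch).
        return text.length();
    }

    public static void main(String[] args) {
        String sql = "/* a /* b */ c */ SELECT 1";
        // Nested mode consumes the whole outer comment.
        System.out.println(sql.substring(0, commentEnd(sql, 0, true)));
        // Flat mode stops at the first "*/".
        System.out.println(sql.substring(0, commentEnd(sql, 0, false)));
    }
}
```

With `nested` set to true the whole outer comment is hidden; with false the comment ends at the first `*/`, so ` c */` is left over for the remaining lexer rules.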
kaby76
  • I strongly recommend against placing predicates before lexer rules. It significantly decreases performance. Place the predicate after the token, not before. See also https://stackoverflow.com/q/39493251/1046374 for details. – Ivan Kochurkin Feb 07 '20 at 20:23
  • Thanks for pointing that out. A predicate in the leading (LHS) position should be avoided. Adding a check for this to the tool or to editor extensions would be a good thing, and it is something I will do. If the predicate is in the LHS position, the number of predicate evaluations can be about the size of the token stream (all tokens on all channels, including 'skip'). Even predicates implemented as fields incur a significant performance hit, because the predicate is wrapped in two method calls and two switches. I wrote a test that measures the effect of LHS predicates [here](https://github.com/kaby76/AntlrExamples/tree/master/perf) – kaby76 Feb 08 '20 at 17:27