I am trying to build a compiler using ANTLR, and for now I want to limit the length of identifiers in my language to fewer than 9 characters.

My code currently looks like this:

IDENTIFIER: CHAR(CHAR|INT)*;

where CHAR and INT are both fragments. Is there a convenient way to achieve this without writing out the repetition by hand, like this:

IDENTIFIER: CHAR(CHAR|INT)(CHAR|INT)...(CHAR|INT); // repeat (CHAR|INT) 8 times

Thanks for your help.

Bob Fang
  • You might want to have a look at [this SO question](http://stackoverflow.com/questions/3056441/what-is-a-semantic-predicate-in-antlr3). It includes an example of semantic predicates that is similar to what you're asking. – nvlass Apr 02 '13 at 09:01
  • @nvlass Thanks, I will have a look. – Bob Fang Apr 02 '13 at 09:12

1 Answer

You should implement this as a separate check after lexing is complete. If you attempt to validate the length of an identifier inside the lexer, input containing Identifier2Long will likely be handled as follows:

  1. Fail to parse Identifier and recover by discarding the I.
  2. Fail to parse dentifier2 and recover by discarding the d.
  3. ...more of this
  4. Finally, successfully parse fier2Long as an identifier and return that as the next token.

You could implement the check by overriding Lexer.nextToken.
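
As a rough illustration of the "check after lexing" idea, here is a minimal, self-contained Java sketch. It does not use the ANTLR runtime: the `Tok` class, the `"IDENTIFIER"` type tag, and the `checkIdentifiers` helper are hypothetical stand-ins invented for this example. In a real ANTLR lexer, the same length test would presumably go inside an overridden `nextToken`, applied to the token returned by `super.nextToken()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IdentifierLengthCheck {
    // Hypothetical stand-in for a lexer token: a type tag plus the matched text.
    static final class Tok {
        final String type;
        final String text;

        Tok(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // "Less than 9" characters means at most 8.
    static final int MAX_IDENTIFIER_LENGTH = 8;

    // Returns one message per over-long identifier. The tokens themselves are
    // left untouched, so the lexer still matches each identifier as a whole.
    static List<String> checkIdentifiers(List<Tok> tokens) {
        List<String> errors = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.type.equals("IDENTIFIER") && t.text.length() > MAX_IDENTIFIER_LENGTH) {
                errors.add("identifier '" + t.text + "' is longer than "
                        + MAX_IDENTIFIER_LENGTH + " characters");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<Tok> tokens = Arrays.asList(
                new Tok("IDENTIFIER", "count"),
                new Tok("IDENTIFIER", "Identifier2Long"));
        for (String error : checkIdentifiers(tokens)) {
            System.out.println(error);
        }
    }
}
```

Because the grammar rule stays as `CHAR(CHAR|INT)*`, the lexer still consumes Identifier2Long as a single token, and the length violation is reported once instead of triggering the character-by-character recovery described above.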

Sam Harwell