My grammar contains the following (condensed):

block
: specialfunction
;

specialfunction
: SPECIALFUNCTION OPAR (parameter (',' parameter)*)? CPAR 
;

SPECIALFUNCTION : 'FUNCTION1'| 'FUNCTION2';

The list of possible values for SPECIALFUNCTION can and will change over time. The names are also used elsewhere in the code so rather than hardcoding them in the grammar and code, I'd like to have a method that returns valid SPECIALFUNCTIONs that can then be called from various places in the code as well as the grammar.

SPECIALFUNCTION : <make a call to get the current list of SPECIALFUNCTIONS e.g. SomeClass.GetListOfNames>

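public static class SomeClass
{
     // Single source of truth for the special function names,
     // returned as ANTLR-style alternatives.
     public static string GetListOfNames()
     {
          return "'FUNCTION1' | 'ANOTHERSPECIALFUNCTION' | 'NEWONE'";
     }
}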

Then, as new special functions are added, I'd just add them to GetListOfNames.

Note I am using C#.

XBond
  • Need to explain more what you perceive the problem to be. You could just change the SPECIALFUNCTION rule to `SPECIALFUNCTION : 'func' NAME ;` – GRosenberg Aug 27 '14 at 00:23
  • Here's a way to accomplish it in ANTLR3: http://stackoverflow.com/questions/6108293/can-i-add-antlr-tokens-at-runtime ANTLR4's API has changed slightly, so a 1-on-1 translation isn't going to work, but the changes are not that big. – Bart Kiers Aug 27 '14 at 07:04
  • @BartKiers - exactly what I was looking for! Thanks for the help. As always you have been very, very helpful. Just one additional question. The example lists this: `Word : {runtimeWordAhead()}?=> ('a'..'z' | 'A'..'Z')+ | 'abc' ;` Is there a way to just emit whatever the input was, rather than the `('a'..'z' | 'A'..'Z')`? That way the name is not limited to letters. – XBond Aug 27 '14 at 17:09

2 Answers

As far as I know, you can force the lexer to emit a token type other than its own; see the section "Lexer Rule Actions" on this page.

If you modified a general Identifier rule like this:

Identifier : [a-zA-Z_] [a-zA-Z0-9_]* { if (isSpecialFunction(getText())) setType(SPECIALFUNCTION); } ;

This would make certain Identifiers a SPECIALFUNCTION, based on information that will be available after creation of the lexer/parser.

I have to admit I don't know if getText() is the correct method in a lexer action.
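The isSpecialFunction call in the action above is not defined anywhere in this answer. A minimal sketch of such a helper, assuming a hypothetical SpecialFunctionRegistry class (the class name, its methods, and the seeded names are illustrative, not part of ANTLR), written in Java to match the action syntax; a C# version would look almost the same:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry of special function names. Nothing here is ANTLR API;
// it only sketches the lookup that the lexer action and other code could share.
final class SpecialFunctionRegistry {
    // Thread-safe set, so names can be registered while lexers are running.
    private static final Set<String> NAMES = ConcurrentHashMap.newKeySet();

    static {
        // Seed values taken from the question's grammar.
        NAMES.add("FUNCTION1");
        NAMES.add("FUNCTION2");
    }

    private SpecialFunctionRegistry() {}

    // Add a new special function name at runtime.
    static void register(String name) {
        NAMES.add(name);
    }

    // True when the matched text is currently a known special function name.
    static boolean isSpecialFunction(String text) {
        return text != null && NAMES.contains(text);
    }

    public static void main(String[] args) {
        register("NEWONE");
        System.out.println(isSpecialFunction("NEWONE"));
        System.out.println(isSpecialFunction("foo"));
    }
}
```

The lexer action would then call SpecialFunctionRegistry.isSpecialFunction(getText()), or a small wrapper declared in the lexer's members section, and the rest of the application can query the same registry.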

Alternatively you could also create the lexer dynamically at runtime and modify the lexer's source by adding additional alternatives.

Another option would be to modify the token stream after lexing by changing the token type of those Identifiers that are a SPECIALFUNCTION.
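A rough sketch of that post-lexing pass, assuming a simplified stand-in token class (Tok) and made-up token type numbers rather than ANTLR's own Token API; with a real generated lexer the same loop would run over the tokens it produced:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class TokenRetagger {
    // Hypothetical token type numbers; a generated lexer defines its own.
    static final int IDENTIFIER = 1;
    static final int SPECIALFUNCTION = 2;

    // Minimal stand-in for a lexer token: a type plus the matched text.
    static final class Tok {
        final int type;
        final String text;

        Tok(int type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Returns a copy of the stream in which every IDENTIFIER whose text is a
    // known special function name has been re-typed as SPECIALFUNCTION.
    static List<Tok> retag(List<Tok> tokens, Set<String> specialNames) {
        List<Tok> out = new ArrayList<>(tokens.size());
        for (Tok t : tokens) {
            boolean special = t.type == IDENTIFIER && specialNames.contains(t.text);
            out.add(special ? new Tok(SPECIALFUNCTION, t.text) : t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> lexed = List.of(new Tok(IDENTIFIER, "FUNCTION1"), new Tok(IDENTIFIER, "x"));
        for (Tok t : retag(lexed, Set.of("FUNCTION1", "NEWONE"))) {
            System.out.println(t.type + " " + t.text);
        }
    }
}
```

The parser then sees SPECIALFUNCTION tokens without the grammar ever listing the names.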

Yet another option would be to make SPECIALFUNCTION a parser rule and check at parse time if an Identifier is a SPECIALFUNCTION.

Onur

Based on only what you have given, you could just change the SPECIALFUNCTION rule to something like

SPECIALFUNCTION : 'func' NAME      ;
NAME            : [A-Z_] [a-zA-Z_]*;

When you walk the parse tree, you can determine whether the NAME in the SpecialFunction context is acceptable or not.

If you are trying to validate run-time options during the parse, the better option is probably to defer that work to when you walk the parse tree.

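To do it during the parse, add a semantic predicate to validate the matched string. ANTLR 4 does not allow attribute references such as $NAME.text inside lexer actions, so the predicate reads the text through getText(), which here includes the 'func' prefix:

SPECIALFUNCTION : 'func' NAME { isValidSFName(getText()) }? ;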

Update: In your last comment, you say that it is better to provide the lexer a list rather than have the lexer check the validity of the name. However, if you think about it, there is no actual difference between having the lexer check a list and having the lexer, in a semantic predicate, check a list. In both cases, as far as the lexer is concerned, the rule will succeed or fail identically.

If, instead, your requirement is for the list of lexer token types to include the changed special function names, the only way to achieve that is through a recompile of the lexer.

Note, if you recompile the lexer, you will need to recompile the parser as well. The token type values are not required to remain constant through compile cycles.

Update2: To clarify in view of your comment on Bart's suggestion, the NAME rule in effect accomplishes the same lookahead without resorting to user code: just allow the rule to match any string that would be an allowable special function name. Whether the NAME rule creates an ambiguity depends on the rest of the lexer rules, but in practice this is not too difficult to avoid.

GRosenberg
  • Hopefully my new explanation makes a bit more sense. I actually don't want to check whether it is valid, but provide the list of valid ones. – XBond Aug 27 '14 at 00:54
  • Providing the list as opposed to checking whether a NAME is valid works better in my instance due to other rules and error checking. – XBond Aug 27 '14 at 01:12
  • Unfortunately you explain what you want to do, not why you need to do it this way. Unless the value list is changing faster than every 500ms or so, the only rational way to do what you want is to edit the grammar and recompile. Automate the edit/compile and pull the values from a controlled source. Re your second comment, I have updated my answer. – GRosenberg Aug 27 '14 at 02:51
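One way to read the "automate the edit/compile" suggestion is a small build step that writes the SPECIALFUNCTION rule from the controlled list before the ANTLR tool runs. A minimal sketch, assuming a hypothetical SpecialFunctionRuleWriter class and that the names arrive as a plain list; how the generated line gets spliced into the grammar file is left open:

```java
import java.util.List;
import java.util.stream.Collectors;

final class SpecialFunctionRuleWriter {
    // Builds the lexer rule text from the current list of names.
    // Backslashes and single quotes are escaped so each name stays a valid literal.
    static String lexerRule(List<String> names) {
        if (names.isEmpty()) {
            throw new IllegalArgumentException("at least one name is required");
        }
        String alternatives = names.stream()
                .map(n -> "'" + n.replace("\\", "\\\\").replace("'", "\\'") + "'")
                .collect(Collectors.joining(" | "));
        return "SPECIALFUNCTION : " + alternatives + " ;";
    }

    public static void main(String[] args) {
        // prints: SPECIALFUNCTION : 'FUNCTION1' | 'ANOTHERSPECIALFUNCTION' | 'NEWONE' ;
        System.out.println(lexerRule(List.of("FUNCTION1", "ANOTHERSPECIALFUNCTION", "NEWONE")));
    }
}
```

Because the lexer and parser are regenerated together in that step, the token type values stay consistent between them.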