6

In ANTLR I want to define a rule like this:

rule : ( a b c | a c b | b a c | b c a | c a b | c b a );

But in my case I have 10 rules instead of three, that I want to permute so it gets very impractical. Is there any way of expressing this in ANTLR without having to write all the permutations?

Lesmana
  • 25,663
  • 9
  • 82
  • 87
Paralife
  • 6,116
  • 8
  • 38
  • 64
  • Could you define a rule like things : a | b | c and then do rule: things things things? – giltanis Jul 30 '11 at 15:10
  • No, that would allow a a a or b b c, which is unacceptable. What you suggest is just (a | b | c){3} (although I dont know if ANTLR supports {x,y} notation for bounded repetition. – Paralife Jul 30 '11 at 15:11
  • @Paralife, no, ANTLR does not support `X{n}` or `X{m,n}` for repetition, unfortunately. – Bart Kiers Jul 30 '11 at 18:57

1 Answers1

8

I would just match any a, b or c once or more:

rule
 : ( a | b | c )+
 ;

and then, after parsing, traversing the parse tree and checking if a, b and c all matched exactly once.

But Yes, it is possible in the grammar itself by using predicates where needed.

A demo:

grammar Permutation;

parse
  :  permutation[5] {System.out.println("parsed: " + $permutation.text);} EOF
  ;

permutation[final int n]
@init{
  java.util.Set set = new java.util.HashSet();
  int counter = n;
}
  :  (
       {counter > 0}?=> token   // keep matching a `token` as long as `counter > 0`
       {                        //
         set.add($token.text);  // add the contents of `token` to `set`
         counter--;             // decrease `counter`
       }                        //
     )+ 
     {set.size() == n}?         // if `set.size() != n`, an exception is thrown
  ;

token
  :  A
  |  B
  |  C
  |  D
  |  E
  ;

A : 'A';
B : 'B';
C : 'C';
D : 'D';
E : 'E';

Space : ' ' {skip();};

The demo grammar above uses 2 different types of predicates: 1) a gated semantic predicate i to make sure that the permutation rule matches no more than the parameter final int n tokens, and 2) a validating semantic predicate i to ensure that the set holds exactly the final int n elements to ensure that it's a proper permutation of the 5 tokens.

More info about semantic predicates can be found here: What is a 'semantic predicate' in ANTLR?

You can test the grammar with the following class:

import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    PermutationLexer lexer = new PermutationLexer(new ANTLRStringStream(args[0]));
    PermutationParser parser = new PermutationParser(new CommonTokenStream(lexer));
    parser.parse();
  }
}
java -cp antlr-3.3.jar org.antlr.Tool Permutation.g 
javac -cp antlr-3.3.jar *.java

java -cp .:antlr-3.3.jar Main "A B C D E"
parsed: ABCDE

java -cp .:antlr-3.3.jar Main "B D C E A"
parsed: BDCEA

java -cp .:antlr-3.3.jar Main "A B C D B"
line 1:9 rule permutation failed predicate: {set.size() == n}?
parsed: null
Bart Kiers
  • 166,582
  • 36
  • 299
  • 288
  • Thanks a lot. This makes me wonder if there could be a wrapping of your solution into a transparent operator (I ve seen in some notations use || for this: ( a||b ) => ( ab | ba) ) for exactly this purpose. Like expanding || internally to the token rule and the action code above... – Paralife Jul 30 '11 at 20:37
  • @Paralife, you're welcome. Where have you seen `a||b` being used (that stands for `ab | ba`) w.r.t. grammars used by parser generators? – Bart Kiers Jul 30 '11 at 20:59
  • I ve seen it as a shortcut notation, not implemented in a parser. Here is where I saw it: http://www.cs.vu.nl/grammarware/help.html – Paralife Jul 30 '11 at 21:21