In PetitParser2, how do I match a closed set of tokens, like month names? E.g. (in pseudocode) [ :word | MonthNames anySatisfy: [ :mn | mn beginsWith: word ] ] asParser.

PPPredicateSequenceParser seemed like a possibility, but it seems you have to know the string size in advance. I guess I could just do something like:

| monthRules |
    "One literal parser per full month name, plus one per three-letter abbreviation."
    monthRules := Array streamContents: [ :s |
        MonthNames do: [ :e |
            s nextPut: e asString asPParser.
            s nextPut: (e first: 3) asPParser ] ].
    ^ PP2ChoiceNode withAll: monthRules

But I was wondering if there is something built in or more straightforward.

Sean DeNigris

2 Answers

Another option, clumsier and less efficient, is to use a custom block:

[ :context | 
    | position names |
    names := #('January' 'February' 'March' 'April').   
    position := context position.
    names do: [ :name | 
        (context next: name size) = name ifTrue: [  
            ^ name
        ] ifFalse: [ 
            context position: position
        ]
    ].
    ^ PP2Failure new
] asPParser parse: 'April'

I would not recommend this though, because PP2 does not know anything about the block and cannot apply any optimizations.

jk_

I would recommend using a parser for each element in the set:

monthsParser := 'January' asPParser / 
                'February' asPParser / 
                'March' asPParser.
monthsParser parse: 'January'

Alternatively, create a choice parser from a collection:

names := #('January' 'February' 'March' 'April').
monthsParser := PP2ChoiceNode withAll: (names collect: [ :l | 
                    l asPParser ]).
monthsParser parse: 'January'

The "optimization" of PP2 should choose the right alternative pretty quickly.
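
To make the optimization remark concrete, here is a minimal sketch of the same choice parser with an explicit call to optimize before parsing. It assumes optimize answers the parser it was sent to, as the PetitParser2 tutorial uses it; the variable names are only illustrative.

```smalltalk
| names monthsParser result |
names := #('January' 'February' 'March' 'April').

"Build the ordered choice of literal parsers, then let PP2 analyse it.
 Assumption: optimize answers the (optimized) parser itself."
monthsParser := (PP2ChoiceNode withAll: (names collect: [ :each | each asPParser ])) optimize.

"Parse one of the names; a failed parse answers a failure object instead."
result := monthsParser parse: 'March'.
result isPetit2Failure
```

Checking isPetit2Failure on the result is a simple way to tell a matched name from a failure.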

jk_
  • Does this fit your need or is there something I missed? – jk_ Nov 26 '19 at 08:43
  • That's pretty much what I ended up with in the second half of my question, but I was wondering if there's something in-built to "match input of unknown length against an arbitrary block". Thinking more, I guess it'd be pretty inefficient because it would have to consider the entire input, but still curious. If it could be done, maybe one could pass a max input size instead of a fixed input size... – Sean DeNigris Nov 27 '19 at 17:29
  • I see. Unfortunately, there is nothing like this in PP2 now. I can imagine we could transform the set of tokens into a deterministic finite-state automaton that recognizes any token in the set in linear time. It would perform better than the choice of literal tokens (as I suggested). – jk_ Nov 29 '19 at 07:48
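
The automaton idea from the last comment can be sketched in plain Pharo, outside PP2, as a trie over the token set. This is only an illustration under stated assumptions, not something PP2 provides: the names root and matchLongest are hypothetical, and the trie is one simple way to get a deterministic automaton for a finite set of strings.

```smalltalk
| names root matchLongest |
names := #('January' 'February' 'March' 'April' 'May' 'June').

"Build a trie: each node is a Dictionary from Character to child node.
 The key #end marks a node where a complete name ends and stores that name."
root := Dictionary new.
names do: [ :name |
    | node |
    node := root.
    name do: [ :char |
        node := node at: char ifAbsentPut: [ Dictionary new ] ].
    node at: #end put: name ].

"Walk the input once from its start, remembering the longest complete name seen.
 Answers nil when no name is a prefix of the input."
matchLongest := [ :input |
    | node longest index |
    node := root.
    longest := nil.
    index := 1.
    [ node notNil and: [ index <= input size ] ] whileTrue: [
        node := node at: (input at: index) ifAbsent: [ nil ].
        node ifNotNil: [ longest := node at: #end ifAbsent: [ longest ] ].
        index := index + 1 ].
    longest ].

matchLongest value: 'March 3rd'
```

Each input character is examined at most once, so the cost depends on the length of the longest name rather than on the number of names in the set.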