I have implemented combinatorial GLR parsers. Among them there are:
char(·)
parser which consumes specified character or range of characters.many(·)
combinator which repeats specified parser from zero to infinite times.
Example: "char('a').many()"
will match a string with any number of "a"
-s.
But many(·)
combinator is greedy, so, for example, char('{') >> char('{') >> char('a'..'z').many() >> char('}') >> char('}')
(where ">>"
is sequential chaining of parsers) will successfully consume the whole "{{foo}}some{{bar}}"
string.
I want to implement the lazy version of many(·)
which, being used in previous example, will consume "{{foo}}"
only. How can I do that?
Edit:
May be I confused ya all. In my program a parser is a function (or "functor" in terms of C++) which accepts a "step" and returns forest of "steps". A "step" may be of OK type (that means that parser has consumed part of input successfully) and FAIL type (that means the parser has encountered error). There are more types of steps but they are auxiliary.
Parser = f(Step) -> Collection of TreeNodes of Steps.
So when I parse input, I:
Compose simple predefined Parser functions to get complex Parser function representing required grammar.
Form initial Step from the input.
Give the initial Step to the complex Parser function.
Filter TreeNodes with Steps, leaving only OK ones (or with minimum FAIL-s if there were errors in input).
Gather information from Steps which were left.