I am trying to write a small and simple parser for symbolic specifications in C++. I first tried using tools such as Yacc/Bison (which I already know) and ANTLR, but these are much too unwieldy and complex for what I am trying to do.
My language has the following binary operators: = (specification), + (sum), . (product); and the following unary operators: Seq (sequence of an expression), Z(###) (unit element which can be numbered), and E(###) (neutral element which can also be numbered). It should also contain variables. I think the binding priority on sum and product should be the standard priority.
Here is an example of a system of rules:
A = Seq B
B = Z(1) + Z(2).Z(2).C
C = E(1) + Z(4).Z(4).B
which I would like to parse to a associative map, indexing expression trees (Seq B) to a variable name (A).
After several auxiliary attempts (using Python, for instance), I have identified an ideal tool for this, Spirit in the Boost Library. I would like something more tightly integrated in the code. And the following StackOverflow gives most of the elements that I should need:
Boolean expression (grammar) parser in c++
However Spirit is pretty complicated, and I am not helped by the fact that I am not comfortable with the advanced C++ traits that the library uses - or more to the fact, I am not sure how to correct errors given by my compiler. My latest attempt:
https://gist.github.com/anonymous/354bb84e54042a3b54f3
It will not compile. Even if it did, I am not sure I understand how I would manipulate the parsed structure. I am not sure if I am on the right track.
Can somebody help get back on point? Or suggest an alternate reasonable solution?
UPDATE 1: Here is a more precise description of what I want.
A list of rules
"VAR = EXPR"
Where EXPR is:
EXPR := EXPR + EXPR
| EXPR . EXPR
| Seq EXPR
| Z<digits*> (defaults to zero if not placed)
| E<digits*> (defaults to zero if not placed)
| VAR
VAR := alnum+ (excludes Z## and E##)
For instance:
PP = Z + L1 . R1 + L2 . R2 + L3 . R3 + Seq Z3 . Seq Z4
L1 = Z10.Z9
R1 = Z12.PP + E4
...
Would give me an associative table:
[ "PP" -> parsing tree that I can navigate for (Z + L1...)
"L1" -> parsing tree for ...
]
Where PP is:
PP = (((((Z + (L1 . R1)) + (L2 . R2)) + (L3 . R3))) + ((Seq Z3) . (Seq Z4)))
I should say that cv_and_he's debugging was already incredibly useful, but: -- I am not sure my grammar is correct; -- I don't know how to parse the list from a system of equations; -- I have no clue how to exploit/navigate the parsing tree once its been parsed.
My endgoal is to write recursive functions that traverse the tree and make some simulations accordingly.
UPDATE 2: I have managed to get most of the parsing done:
https://gist.github.com/anonymous/b6a63a1aa5caa6461dd6
I wonder if it is correct, or if I am missing a mistake. After that, what I cannot figure out is: - how to best parse a list of such rules (instead of just one); - how to traverse the parsed rules, and modify states (for instance, I would like to renumber the variables, and associate each variable with an integer that is always the same - lambda renumbering).
UPDATE 3: I guess part of my question now is: how can I make a "static visitor" not static, i.e., how can I make it have a state, that is modifiable.
UPDATE 4: I found an answer to the last question here:
How can I use the boost visitor concept with a class containing state variables?
I wonder if it is a sound solution. And I am still left wondering what is the best way of parsing a list of rules.