Best practice for boost spirit context-dependent grammar rule

Question

Here is a sample to clarify the issue (this is pseudo code).

Classic way: just make a rule for every path. Starting at "start", the parser selects outer_rule1 or outer_rule2 and from there goes into inner_rule1 or inner_rule2. You can clearly see that the inner rules are nearly equal. E.g. think about a grammar where the line separator is given once by the symbol ":" and once by ";".

inner_rule1 = a >> b >> ":"
inner_rule2 = a >> b >> ";"

outer_rule1 = "X" >> inner_rule1
outer_rule2 = "Z" >> inner_rule2

start=outer_rule1 | outer_rule2

You can overcome this issue by placing the separator at the top level:

inner_rule1 = a >> b
inner_rule2 = a >> b

outer_rule1 = "X" >> inner_rule1 >> ":"
outer_rule2 = "Z" >> inner_rule2 >> ";"

start=outer_rule1 | outer_rule2

But if the inner rules are more complex, the separator may also be used inside a nested rule, and then it becomes tricky to reuse the same rules while exchanging the separator:

complex_inner1= w >> ";"
complex_inner2= r >> ":"

inner_rule1 = a >> +complex_inner1
inner_rule2 = a >> +complex_inner2

outer_rule1 = "X" >> inner_rule1
outer_rule2 = "Z" >> inner_rule2

start=outer_rule1 | outer_rule2

The question is how to write something like the following, here e.g. with a custom action. But we know that custom actions are not the best choice, especially when backtracking is involved.

complex_inner1= w >> separator
complex_inner2= r >> separator

inner_rule1 = a[separator=";"] >> +complex_inner1
inner_rule2 = a[separator=":"] >> +complex_inner2

outer_rule1 = "X" >> inner_rule1
outer_rule2 = "Z" >> inner_rule2

start=outer_rule1 | outer_rule2

sehe · Accepted Answer · 2021-04-06T14:43:06.163

You forgot to specify which version of Spirit (Qi or X3) (again?).

So, here goes:

Spirit Qi: Inherited Attributes

You can inject state into the rules by using Inherited Attributes or Locals.

Demo using the first:

Live On Compiler Explorer

#include <boost/spirit/include/qi.hpp>
#include <fmt/ranges.h>

namespace qi = boost::spirit::qi;

using Attr = std::vector<int>;

template <typename It>
struct Parser : qi::grammar<It, Attr()> {
    Parser() : Parser::base_type(start) {
        using namespace qi;
        inner  = int_ % lit(_r1);
        outer1 = 'X' >> inner(':');
        outer2 = 'Y' >> inner(';');
        start  = skip(space)[outer1 | outer2];
    }

  private:
    qi::rule<It, Attr()> start;
    qi::rule<It, Attr(), qi::space_type> outer1, outer2;
    qi::rule<It, Attr(char), qi::space_type> inner;
};

int main() {
    using It = std::string::const_iterator;
    Parser<It> p;

    for (std::string const& s : {
             "",
             " Y 7 ",
             "X 7:-4:+99 ",
             "Y 7 ; 42 ",
         })
    {
        It f = begin(s), l = end(s);

        Attr v;
        bool ok = parse(f, l, p, v);

        fmt::print("Parsed: {} {}, remaining: '{}'\n", ok, v,
                   std::string(f, l));
    }
}

Prints

Parsed: false {}, remaining: ''
Parsed: true {7}, remaining: ' '
Parsed: true {7, -4, 99}, remaining: ' '
Parsed: true {7, 42}, remaining: ' '

X3: Function Composition

In X3 many of the limitations of Qi fall away because it's much easier to compose rules. You'd write almost the same grammar, but with a plain function taking the delimiter instead of an inherited attribute:

Live On Compiler Explorer

#include <boost/spirit/home/x3.hpp>
#include <fmt/ranges.h>

namespace x3 = boost::spirit::x3;

using Attr = std::vector<int>;

namespace Parser {
    using namespace x3;
    auto inner  = [](auto delim) { return int_ % delim; };
    auto outer1 = 'X' >> inner(':');
    auto outer2 = 'Y' >> inner(';');
    auto start  = skip(space)[outer1 | outer2];
} // namespace Parser

int main() {
    using It = std::string::const_iterator;

    for (std::string const& s : {
            "",
            " Y 7 ",
            "X 7:-4:+99 ",
            "Y 7 ; 42 ",
        })
    {
        It f = begin(s), l = end(s);

        Attr v;
        bool ok = parse(f, l, Parser::start, v);

        fmt::print("Parsed: {} {}, remaining: '{}'\n", ok, v,
                std::string(f, l));
    }
}

Printing the same.

To be fair, this glosses over a good deal of intricacies that happen not to be important for the sample given. But that's the benefit of non-pseudo code: it has details. If you run into any of the subtler issues down the road, I hope you'll be back with concrete code :)

Sorry for using pseudo code here, and thank you for posting full code. I played with your sample and it seems that the attribute is automatically inherited by all sub rules. Is this assumption correct? `inner2 = int_ % lit(_r1);` `inner1 = inner2;` `outer1 = 'X' >> inner1(':');` `outer2 = 'Y' >> inner1(';');` `start = skip(space)[outer1 | outer2];` — Markus, Apr 06 '21 at 14:38
I don't see what you mean. I see no subrules of `inner[12]` in my code or your comment? (Also, no automatic inheritance exists; you will always have to pass them explicitly. `qi::locals` are different in that respect.) — sehe, Apr 06 '21 at 14:41
In the code above (comment) I use `inner1 = inner2` but only call inner1 with the additional attribute. Anyway, it seems that inner2 also gets the attribute. — Markus, Apr 06 '21 at 14:51
Ah. Missed that. That's because you accidentally used the copy constructor instead of rule initialization. You should be using `inner1 = inner2(_r1);` (or `inner1 = inner2.alias();` if you had no inherited attributes). This is one of those "EDSL is a leaky abstraction" gotchas. — sehe, Apr 06 '21 at 14:53
