
I'm trying to write a parser for a language with a somewhat unusual syntax, and I stumbled upon a problem with skippers which makes me think that I do not fully understand how they work in Boost.Spirit.X3.

The problem is that for some rules EOLs are meaningful (i.e. I have to match the end of the line to be sure the statement is correct), while for others they are not (so they can be skipped).

As a result, I decided to use the following definition of the skipper for my root rule:

namespace x3 = boost::spirit::x3;
namespace ch = x3::standard;

using ch::blank;
using x3::eol;

auto const skipper = comment | blank;

where comment just skips comments obviously. In other words, I preserve EOLs in the input stream.

Now, for another rule, I'd like to use the definition like this:

auto const writable_property_declaration_def =
    skip(skipper | eol)
    [
        lit("#")
        > property_type
        > property_id
    ];

The rule itself is part of another rule, which is instantiated as follows:

BOOST_SPIRIT_INSTANTIATE(property_declaration_type, iterator_type, context_type);

where

using skipper_type = decltype(skipper);

using iterator_type = std::string::const_iterator;
using phrase_context_type = x3::phrase_parse_context<skipper_type>::type;
using error_handler_type = x3::error_handler<iterator_type>;
using context_type = x3::context<x3::error_handler_tag, std::reference_wrapper<error_handler_type>, phrase_context_type>;

That does not seem to work: the EOLs are not skipped.

Now, my questions are the following:

  • What's the connection between boost::spirit::x3::phrase_parse_context and the particular skipper I use?
  • And how does skip(p)[a] actually work?
  • Is it possible to somehow define the underlying rule in such a way that it uses another skipper so that the X3 handles all the EOLs on its own and I don't need to do it manually?

Looking forward to your reply(-ies)! :)

GooRoo

1 Answer


You didn't actually show all the declarations, so it's not completely clear what your setup looks like. So let me mock up something quick:

Live On Wandbox

#define BOOST_SPIRIT_X3_DEBUG
#include <iomanip>
#include <iostream>
#include <string>
#include <boost/spirit/home/x3.hpp>

namespace x3 = boost::spirit::x3;
namespace P {
    using namespace x3;
    static auto const comment = lexeme [ 
            "/*" >> *(char_ - "*/") >> "*/"
          | "//" >> *~char_("\r\n") >> eol
        ];

    static auto const skipper = comment | blank;

    static auto const property_type = lexeme["type"];
    static auto const property_id = lexeme["id"];

    auto const demo =
        skip(skipper | eol) [
            lit("#")
            > property_type
            > property_id
        ];
}

int main() {
    for (std::string const input : {
            "#type id",
            "#type\nid",
        })
    {
        std::cout << "==== " << std::quoted(input) << " ====" << std::endl;
        auto f = begin(input), l = end(input);
        if (parse(f, l, P::demo)) {
            std::cout << "Parsed successfully" << std::endl;
        } else {
            std::cout << "Failed" << std::endl;
        }

        if (f!=l) {
            std::cout << "Remaining input unparsed: " << std::quoted(std::string(f,l)) << std::endl;
        }
    }
}

As you can see there's not actually a problem unless the rule declarations get involved:

==== "#type id" ====
Parsed successfully
==== "#type
id" ====
Parsed successfully

Let's zoom in from here:

static auto const demo_def =
    skip(skipper | eol) [
        lit("#")
        > property_type
        > property_id
    ];

static auto const demo = x3::rule<struct demo_> {"demo"} = demo_def;

Still OK: Live On Wandbox

<demo>
  <try>#type id</try>
  <success></success>
</demo>
<demo>
  <try>#type\nid</try>
  <success></success>
</demo>
Parsed successfully
==== "#type
id" ====
Parsed successfully

So, we know that x3::rule<> is not actually the issue. It's going to be about the static dispatch based on the tag type (a.k.a. the rule ID, I think; in this case struct demo_).

Doing the straight-forward:

static auto const demo_def =
    skip(skipper | eol) [
        lit("#")
        > property_type
        > property_id
    ];

static auto const demo = x3::rule<struct demo_> {"demo"};

BOOST_SPIRIT_DEFINE(demo)

Still OK: Live On Wandbox

Hmm, what else could be wrong? Maybe there are conflicting skipper contexts? Replacing

    if (parse(f, l, P::demo)) {

with

    if (phrase_parse(f, l, P::demo, P::skipper)) {

Still OK: Live On Wandbox

So, that's not it either. Ok, let's try the separate instantiation:

Separate Compilation

Live On Wandbox

  • rule.h

    #pragma once
    #define BOOST_SPIRIT_X3_DEBUG
    #include <boost/spirit/home/x3.hpp>
    #include <boost/spirit/home/x3/support/utility/error_reporting.hpp>
    
    namespace x3 = boost::spirit::x3;
    namespace P {
        using namespace x3;
        static auto const comment = lexeme [ 
                "/*" >> *(char_ - "*/") >> "*/"
              | "//" >> *~char_("\r\n") >> eol
            ];
    
        static auto const skipper = comment | blank;
    
        using demo_type = x3::rule<struct demo_>;
        extern demo_type const demo;
    
        BOOST_SPIRIT_DECLARE(demo_type)
    }
    
  • rule.cpp

    #include "rule.h"
    #include <iostream>
    #include <iomanip>
    
    namespace P {
        using namespace x3;
    
        static auto const property_type = lexeme["type"];
        static auto const property_id = lexeme["id"];
    
        static auto const demo_def =
            skip(skipper | eol) [
                lit("#")
                > property_type
                > property_id
            ];
    
        struct demo_ {
            template<typename It, typename Ctx>
                x3::error_handler_result on_error(It f, It l, expectation_failure<It> const& ef, Ctx const&) const {
                    std::string s(f,l);
                    auto pos = std::distance(f, ef.where());
    
                    std::cout << "Expecting " << ef.which() << " at "
                        << "\n\t" << s
                        << "\n\t" << std::setw(pos) << std::setfill('-') << "" << "^\n";
    
                    return error_handler_result::fail;
                }
        };
    
        demo_type const demo {"demo"};
        BOOST_SPIRIT_DEFINE(demo)
    
        // for non-skipper invocation (x3::parse)
        using iterator_type = std::string::const_iterator;
        BOOST_SPIRIT_INSTANTIATE(demo_type, iterator_type, x3::unused_type)
    
        // for skipper invocation (x3::phrase_parse)
        using skipper_type = decltype(skipper);
        using phrase_context_type = x3::phrase_parse_context<skipper_type>::type;
        BOOST_SPIRIT_INSTANTIATE(demo_type, iterator_type, phrase_context_type)
    }
    
  • test.cpp

    #include "rule.h"
    #include <iostream>
    #include <iomanip>
    
    int main() {
        std::cout << std::boolalpha;
        for (std::string const input : {
                "#type id",
                "#type\nid",
            })
        {
            std::cout << "\n==== " << std::quoted(input) << " ====" << std::endl;
    
            {
                auto f = begin(input), l = end(input);
                std::cout << "With top-level skipper: " << phrase_parse(f, l, P::demo, P::skipper) << std::endl;
    
                if (f!=l) {
                    std::cout << "Remaining unparsed: " << std::quoted(std::string(f,l)) << std::endl;
                }
            }
            {
                auto f = begin(input), l = end(input);
                std::cout << "Without top-level skipper: " << parse(f, l, P::demo) << std::endl;
    
                if (f!=l) {
                    std::cout << "Remaining unparsed: " << std::quoted(std::string(f,l)) << std::endl;
                }
            }
        }
    }
    

Prints the expected:

==== "#type id" ====
With top-level skipper: <demo>
  <try>#type id</try>
  <success></success>
</demo>
true
Without top-level skipper: <demo>
  <try>#type id</try>
  <success></success>
</demo>
true

==== "#type
id" ====
With top-level skipper: <demo>
  <try>#type\nid</try>
  <success></success>
</demo>
true
Without top-level skipper: <demo>
  <try>#type\nid</try>
  <success></success>
</demo>
true

Or, without debug enabled:

==== "#type id" ====
With top-level skipper: true
Without top-level skipper: true

==== "#type
id" ====
With top-level skipper: true
Without top-level skipper: true

FINAL THOUGHTS

Sadly, perhaps, I cannot reproduce the symptom you describe. However, I hope some of the steps above clarify how separate linkage of rule definitions actually works with respect to the skipper/contexts.

If your situation is actually more complicated, the only other thing I can think of is a place where X3 differs from Qi. In Qi, a rule statically declared its skipper. In X3, the skipper comes strictly from the context (and the only way a rule can limit the set of supported skippers is by separating instantiation and hiding the definition in a separate TU).

This means that it is easy to accidentally inherit an overridden skipper. This can be counter-intuitive in e.g. nested rules. I'd suggest not relying on inherited skipper contexts at all if you have different skippers.

sehe
  • Thanks a lot for such a detailed step-by-step analysis! Sorry that I haven't given you more context from my side. Actually, my situation is far more complicated than this, and I just didn't know what is important and relevant and what's not. I'll play with the code you provided a little bit and come back with additional questions (if any) or with a resolution. But as of now, it feels like I missed something important at the very beginning and the problem is definitely somewhere on my side. – GooRoo Dec 29 '19 at 15:22
  • So, I checked all the code examples and they indeed work as expected. Then I cleaned up my code base, removing all the workarounds and the manual matching of each EOL, and updated it in line with the examples. And it works! For those who are wondering: it's already difficult to say what the problem was, as I'm quite far from my initial implementation. Most probably, I tried to match EOLs that had already been consumed from the input stream. Anyway, @sehe, thanks again! – GooRoo Dec 29 '19 at 18:39
  • Cheers. Thanks for letting us know. This is also helpful because sometimes people just need the confidence boost to look at things another time :) – sehe Dec 29 '19 at 19:48