
I am trying to write a boost::spirit::x3 parser that, rather than producing the matched substrings, produces the offsets and lengths of the matched strings in the source.

I have tried various combinations of on_success handlers and semantic actions, but nothing has really worked.

Given:

ABC\n
DEFG\n
HI\n

I'd like a parser that produces a std::vector<boost::tuple<size_t, size_t>> containing:

0,3
4,4
9,2

Clearly, it gets more complicated once we match specific substrings on each line rather than taking the whole line.
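For reference, the whole-line pairs listed above can be computed without Spirit by a plain scan for newlines. The sketch below is only a baseline that pins down the expected output, not an X3 solution; `line_spans` is a hypothetical helper name, and it uses std::pair instead of boost::tuple to stay within the standard library.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Spirit): returns one (offset, length)
// pair per line. Lines are separated by '\n'; a final line without a
// trailing newline is reported as well.
std::vector<std::pair<std::size_t, std::size_t>>
line_spans(std::string const& text) {
    std::vector<std::pair<std::size_t, std::size_t>> spans;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t nl = text.find('\n', start);
        if (nl == std::string::npos) nl = text.size(); // unterminated last line
        spans.emplace_back(start, nl - start);
        start = nl + 1; // continue after the newline
    }
    return spans;
}
```

Calling `line_spans("ABC\nDEFG\nHI\n")` yields the three pairs (0,3), (4,4) and (9,2).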

Is this possible?

experquisite
  • Does it help you to search [my answers where I use spirit Qi with boost::string_ref](https://stackoverflow.com/search?q=user%3A85371+spirit+string_ref+)? The same applies to X3 but the mechanics of semantic actions are - obviously - different in X3. Several answers are very close to what you describe. – sehe May 03 '16 at 07:57

1 Answer


Here's a quick draft.

I've replaced the (offset, length) tuple with a POD struct, because x3::raw[] and fusion/adapted/std_tuple.hpp interact in such a way that you need to specialize traits::move_to anyway.

In such cases I much prefer specializing on a user-defined type over special-casing generic standard library types, which could collide with other uses elsewhere.

So, let the struct be

using It = char const*;
struct Range {
   It data;
   size_t size;
};

Then, to parse the following sample input:

char const input[] = "{ 123, 234, 345 }\n{ 456, 567, 678 }\n{ 789, 900, 1011 }";

We need nothing more than a simple grammar:

x3::raw ['{' >> (x3::int_ % ',') >> '}'] % x3::eol

And an equally simple trait specialization:

namespace boost { namespace spirit { namespace x3 { namespace traits {
    template <> void move_to<It, Range>(It b, It e, Range& r) { r = { b, size_t(e-b) }; }
} } } }

Full Demo

#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <iterator>
#include <vector>

using It = char const*;
struct Range {
   It data;
   size_t size;
};

// Tell X3 how to store the iterator pair [b, e) exposed by raw[] into a Range.
namespace boost { namespace spirit { namespace x3 { namespace traits {
    template <> void move_to<It, Range>(It b, It e, Range& r) { r = { b, size_t(e-b) }; }
} } } }

int main() {
    char const input[] = "{ 123, 234, 345 }\n{ 456, 567, 678 }\n{ 789, 900, 1011 }";

    std::vector<Range> ranges;

    namespace x3 = boost::spirit::x3;
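    // raw[] exposes the matched source range instead of the parsed ints.
    // x3::blank skips spaces and tabs but not newlines, so x3::eol can separate records.
    if (x3::phrase_parse(
            std::begin(input), std::end(input),
            x3::raw ['{' >> (x3::int_ % ',') >> '}'] % x3::eol,
            x3::blank,
            ranges)
        )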
    {
        std::cout << "Parse results:\n";
        for (auto const& r : ranges) {
            std::cout << "(" << (r.data-input) << "," << r.size << ")\n";
        }
    } else {
        std::cout << "Parse failed\n";
    }
}

Prints:

Parse results:
(0,17)
(18,17)
(36,18)
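
If you want the (offset, length) pairs from the question rather than pointers, subtract the start of the buffer from each Range, as the printing loop above already does. A minimal sketch, assuming every Range points into the buffer that starts at `base`; `to_offsets` is a hypothetical helper name, not part of X3:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using It = char const*;
struct Range {
    It data;
    std::size_t size;
};

// Hypothetical helper: converts each Range into (offset from `base`, length).
// Assumes all ranges refer to the same buffer that `base` points to.
std::vector<std::pair<std::size_t, std::size_t>>
to_offsets(std::vector<Range> const& ranges, It base) {
    std::vector<std::pair<std::size_t, std::size_t>> out;
    out.reserve(ranges.size());
    for (Range const& r : ranges)
        out.emplace_back(static_cast<std::size_t>(r.data - base), r.size);
    return out;
}
```

Keeping the pointer in Range during parsing and converting afterwards means the trait specialization does not need to know where the input starts.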
sehe
  • Why does it fail to compile if I switch the rule inside the raw to `+(~x3::char_("\r\n"))`? It goes back to trying to create the Range from a char rather than move_to'ing it. – experquisite May 03 '16 at 20:50
  • Seems like the age-old single-element sequence conundrum. It should be gone in the develop branch. Finding the link – sehe May 03 '16 at 20:58
  • @experquisite Sorry to keep you waiting: http://boost.2283326.n4.nabble.com/Single-element-attributes-in-X3-still-broken-tp4681549p4683318.html – sehe May 03 '16 at 21:08
  • Great, thanks! I am using X3 in Boost 1.60. Will this fix get into 1.61? – experquisite May 04 '16 at 00:26