On Stack Overflow there are several questions and answers about using {boost, std}::string_view, e.g.:

  • parsing from std::string into a boost::string_view using boost::spirit::x3, by overloading x3's move_to:

    namespace boost { namespace spirit { namespace x3 { namespace traits {
    
    template <typename It>
    void move_to(It b, It e, boost::string_view& v)
    {
        v = boost::string_view(&*b, e-b);
    }
    
    } } } }
    
  • parse into vector using boost::spirit::x3, which additionally explains how to declare the attributes as compatible:

    namespace boost { namespace spirit { namespace x3 { namespace traits { 
        template <>
        struct is_substitute<raw_attribute_type,boost::string_view> : boost::mpl::true_
        {};
    } } } }
    

llonesmiz wrote an example at wandbox which compiles with boost 1.64, but now fails with boost 1.67 with:

    opt/wandbox/boost-1.67.0/gcc-7.3.0/include/boost/spirit/home/x3/support/traits/container_traits.hpp:177:15: error: 'class boost::basic_string_view<char, std::char_traits<char> >' has no member named 'insert'
                 c.insert(c.end(), first, last);
                 ~~^~~~~~

I faced the same error in my project.

The problem also arises with std::string_view, even with explicit use of sehe's as<> "directive"; see also at wandbox:

    #include <iostream>
    #include <string>
    #include <string_view>
    #include <vector>

    namespace boost { namespace spirit { namespace x3 { namespace traits {

    template <typename It>
    void move_to(It b, It e, std::string_view& v)
    {
        v = std::string_view(&*b, e-b);
    }

    } } } } // namespace boost


    #include <boost/spirit/home/x3.hpp>


    namespace boost { namespace spirit { namespace x3 { namespace traits {

    template <>
    struct is_substitute<raw_attribute_type, std::string_view> : boost::mpl::true_
    {};

    } } } } // namespace boost


    namespace parser
    {
        namespace x3 = boost::spirit::x3;
        using x3::char_;
        using x3::raw;

        template<typename T>
        auto as = [](auto p) { return x3::rule<struct _, T>{ "as" } = x3::as_parser(p); };

        const auto str = as<std::string_view>(raw[ +~char_('_')] >> '_');
        const auto str_vec  = *str;
    }

    int main()
    {
        std::string input = "hello_world_";

        std::vector<std::string_view> strVec; 
        boost::spirit::x3::parse(input.data(), input.data()+input.size(), parser::str_vec, strVec);

        for(auto& x : strVec) { std::cout << x << std::endl; }
    }

As far as I can see, the problem starts with boost 1.65. What has changed, and how can it be fixed?

Finally, I have a question about the contiguous-storage requirement mentioned by sehe: I understand the requirement, but can the parser violate it? In my opinion the parser has to fail even on backtracking, so this can't happen with spirit. With the error_handler in use, the memory that the string_view refers to is still valid, at least at the parse level. I conclude that it's safe to use string_view as long as the referenced storage stays in scope under this condition, isn't it?
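To make the lifetime condition concrete, here is a minimal stdlib-only sketch. It does not use Spirit at all: split_terminated and points_into are hypothetical helpers that mimic what the grammar `*(raw[+~char_('_')] >> '_')` is expected to produce, namely views that point into the input buffer. The views are only usable while that buffer is alive and unmodified.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

namespace sketch {

// Hand-written stand-in for the grammar: non-empty token, then '_', repeated.
// Every returned view refers to storage owned by the caller's input.
inline std::vector<std::string_view> split_terminated(std::string_view input, char delim = '_') {
    std::vector<std::string_view> out;
    std::size_t pos = 0;
    while (pos < input.size()) {
        std::size_t const end = input.find(delim, pos);
        // Stop on a missing terminator or an empty token, like + followed by '_'.
        if (end == std::string_view::npos || end == pos) break;
        out.push_back(input.substr(pos, end - pos));
        pos = end + 1;
    }
    return out;
}

// True if the view lies completely inside the owner's buffer.
inline bool points_into(std::string_view token, std::string const& owner) {
    std::less_equal<char const*> le;
    assert(owner.data() != nullptr);
    return le(owner.data(), token.data()) &&
           le(token.data() + token.size(), owner.data() + owner.size());
}

} // namespace sketch
```

With `std::string const input = "hello_world_";` the call `sketch::split_terminated(input)` yields two views, and both satisfy `points_into(view, input)`. Once `input` is destroyed or reallocated, those views dangle, regardless of how they were produced.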

Olx

1 Answer

The problem here seems to be with the is_container trait:

template <typename T>
using is_container = mpl::bool_<
    detail::has_type_value_type<T>::value &&
    detail::has_type_iterator<T>::value &&
    detail::has_type_size_type<T>::value &&
    detail::has_type_reference<T>::value>;

In Qi, that would have been specializable:

template <> struct is_container<std::string_view> : std::false_type {};

However in X3 it started being a template alias, which cannot be specialized.
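To illustrate, here is a minimal stdlib-only sketch of such a heuristic. The has_* detectors are hypothetical stand-ins written with std::void_t, not X3's actual detail:: implementation; the point is only that std::string_view has all four nested types, and that an alias template leaves no room for a specialization.

```cpp
#include <string_view>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical stand-ins for the detail::has_type_* detectors.
template <typename T, typename = void> struct has_value_type : std::false_type {};
template <typename T>
struct has_value_type<T, std::void_t<typename T::value_type>> : std::true_type {};

template <typename T, typename = void> struct has_iterator : std::false_type {};
template <typename T>
struct has_iterator<T, std::void_t<typename T::iterator>> : std::true_type {};

template <typename T, typename = void> struct has_size_type : std::false_type {};
template <typename T>
struct has_size_type<T, std::void_t<typename T::size_type>> : std::true_type {};

template <typename T, typename = void> struct has_reference : std::false_type {};
template <typename T>
struct has_reference<T, std::void_t<typename T::reference>> : std::true_type {};

// Same shape as the alias quoted above.
template <typename T>
using is_container = std::bool_constant<
    has_value_type<T>::value && has_iterator<T>::value &&
    has_size_type<T>::value && has_reference<T>::value>;

// string_view "looks like" a container to this heuristic.
static_assert(is_container<std::string_view>::value);
static_assert(is_container<std::vector<int>>::value);
static_assert(!is_container<int>::value);

// An alias template cannot be specialized, so this would not compile:
// template <> struct is_container<std::string_view> : std::false_type {};

} // namespace sketch
```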

This is a tough issue, as it seems that there is simply no customization point to get X3 to do what we need here.

Workaround

I've tried to dig deeper. I have not seen a "clean" way around this. In fact, the attribute coercion trick can help, though, if you use it to "short out" the heuristic that causes the match:

  • the attribute is "like a" container of "char"
  • the parser could match such a container

In this situation we can coerce the parser's attribute to specifically be non-compatible, and things will start working.

Correctly Overriding move_to

This, too, is an area of contention. Simply adding the overload like:

template <typename It>
inline void move_to(It b, It e, std::string_view& v) {
    v = std::string_view(&*b, std::distance(b,e));
}

is not enough to make it the best overload.

The base template is

template <typename Iterator, typename Dest>
inline void move_to(Iterator first, Iterator last, Dest& dest);

To actually make it stick, we need to specialize. However, specializing and function templates is not a good match. In particular, we can't partially specialize, so we'll end up hard-coding the template arguments:

template <>
inline void move_to<Iterator, std::string_view>(Iterator b, Iterator e, std::string_view& v) {
    v = std::string_view(&*b, std::distance(b,e));
}

This makes me question whether move_to is "user-serviceable" at all. Much like is_container<> above, it just seems not designed for extension.

I do realize I've applied it in the past myself, but I also learn as I go.
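A minimal stdlib-only sketch of why the full specialization is limiting: sketch::move_to below is a hypothetical stand-in for the base template (not X3's code), and the Picked return value exists only to show which version ran. The specialization hard-codes char const*, so any other iterator type silently falls back to the generic version.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {

// For illustration only: reports which version was selected.
enum class Picked { generic, specialized };

// Stand-in for the library's base template; does nothing with dest.
template <typename Iterator, typename Dest>
Picked move_to(Iterator, Iterator, Dest&) { return Picked::generic; }

// Full specialization: both template arguments are fixed.
// A partial specialization (any Iterator, fixed Dest) is not allowed
// for function templates.
template <>
inline Picked move_to<char const*, std::string_view>(char const* b, char const* e,
                                                     std::string_view& v) {
    assert(b <= e);
    v = std::string_view(b, static_cast<std::size_t>(e - b));
    return Picked::specialized;
}

} // namespace sketch
```

Calling `sketch::move_to(input.data(), input.data() + input.size(), sv)` on a `std::string const input` selects the specialization, while `sketch::move_to(input.begin(), input.end(), sv)` on common standard library implementations (where std::string::const_iterator is a class type) selects the generic version and leaves `sv` untouched.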

Coercing: Hacking the System

Instead of declaring the rule's attribute std::string_view (leaving X3's type magic room to "do the right thing"), let's etch in stone the intended outcome of raw[] (and leave X3 to do the rest of the magic using move_to):

namespace parser {
    namespace x3 = boost::spirit::x3;
    const auto str 
        = x3::rule<struct _, boost::iterator_range<Iterator> >{"str"}
        = x3::raw[ +~x3::char_('_')] >> '_';
    const auto str_vec  = *str;
}

This works. See it Live On Wandbox

Prints

hello
world

Alternative

That seems brittle. E.g. it'll break if you change Iterator to char const* (or, use std::string const input = "hello_world_", but not both).

Here's a better take (I think):

namespace boost { namespace spirit { namespace x3 {

    template <typename Char, typename CharT, typename Iterator> 
    struct default_transform_attribute<std::basic_string_view<Char, CharT>, boost::iterator_range<Iterator>> {
        using type = boost::iterator_range<Iterator>;

        template <typename T> static type pre(T&&) { return {}; }

        static void post(std::basic_string_view<Char, CharT>& sv, boost::iterator_range<Iterator> const& r) {
            sv = std::basic_string_view<Char, CharT>(std::addressof(*r.begin()), r.size());
        }
    };

} } }
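The conversion done in post() can be sketched without Spirit. to_string_view below is a hypothetical helper, not part of X3. Unlike the transform above, it guards the empty range, because dereferencing `r.begin()` when it equals `r.end()` would be undefined behaviour; the grammar here uses `+`, so the range is never empty in this particular case.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <string_view>

namespace sketch {

// Builds a view over [first, last).
// Assumption: It is a contiguous iterator over char; this is not checked here.
template <typename It>
std::string_view to_string_view(It first, It last) {
    auto const n = std::distance(first, last);
    assert(n >= 0);
    if (n == 0) return {};  // never dereference the end of an empty range
    return std::string_view(std::addressof(*first), static_cast<std::size_t>(n));
}

} // namespace sketch
```

For `std::string const input = "hello_world_";`, `sketch::to_string_view(input.begin(), input.begin() + 5)` compares equal to "hello" and shares its data pointer with `input`.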

Now, the only hoop left to jump is that the rule declaration mentions the iterator type. You can hide this too:

namespace parser {
    namespace x3 = boost::spirit::x3;

    template <typename It> const auto str_vec = [] {
        const auto str 
            = x3::rule<struct _, boost::iterator_range<It> >{"str"}
            = x3::raw[ +~x3::char_('_')] >> '_';
        return *str;
    }();
}

auto parse(std::string_view input) {
    auto b = input.begin(), e = input.end();
    std::vector<std::string_view> data;
    parse(b, e, parser::str_vec<decltype(b)>, data);
    return data;
}

int main() {
    for(auto& x : parse("hello_world_"))
        std::cout << x << "\n";
}

This at once demonstrates that it works with non-pointer iterators.

Note: for completeness you'd want to statically assert that the iterator models the ContiguousIterator concept (C++17)
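C++17 names the requirement but offers no standard trait to test it, so any check is an approximation. A minimal sketch, assuming a conservative whitelist is acceptable: is_known_contiguous_v is a hypothetical name, and it only accepts raw pointers and the iterators of std::string and std::string_view. With C++20 you could use the std::contiguous_iterator concept from <iterator> instead.

```cpp
#include <list>
#include <string>
#include <string_view>
#include <type_traits>

namespace sketch {

// Conservative whitelist: true only for iterator types known to be contiguous.
// It rejects some iterators that are in fact contiguous (e.g. std::vector's),
// which is the safe direction for this check.
template <typename It>
inline constexpr bool is_known_contiguous_v =
    std::is_pointer_v<It> ||
    std::is_same_v<It, std::string::iterator> ||
    std::is_same_v<It, std::string::const_iterator> ||
    std::is_same_v<It, std::string_view::const_iterator>;

static_assert(is_known_contiguous_v<char const*>);
static_assert(is_known_contiguous_v<std::string_view::const_iterator>);
static_assert(!is_known_contiguous_v<std::list<char>::iterator>);

} // namespace sketch
```

A line such as `static_assert(sketch::is_known_contiguous_v<It>, "string_view needs contiguous storage");` could then be placed wherever the iterator type is known, for example inside the post() function of the transform.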

Final Version Live

Live On Wandbox

#include <iostream>
#include <string>
#include <string_view>
#include <vector>
#include <boost/spirit/home/x3.hpp>

namespace boost { namespace spirit { namespace x3 {

    template <typename Char, typename CharT, typename Iterator> 
    struct default_transform_attribute<std::basic_string_view<Char, CharT>, boost::iterator_range<Iterator>> {
        using type = boost::iterator_range<Iterator>;

        template <typename T> static type pre(T&&) { return {}; }

        static void post(std::basic_string_view<Char, CharT>& sv, boost::iterator_range<Iterator> const& r) {
            sv = std::basic_string_view<Char, CharT>(std::addressof(*r.begin()), r.size());
        }
    };

} } }

namespace parser {
    namespace x3 = boost::spirit::x3;

    template <typename It> const auto str_vec = [] {
        const auto str 
            = x3::rule<struct _, boost::iterator_range<It> >{"str"}
            = x3::raw[ +~x3::char_('_')] >> '_';
        return *str;
    }();
}

auto parse(std::string_view input) {
    auto b = input.begin(), e = input.end();
    std::vector<std::string_view> data;
    parse(b, e, parser::str_vec<decltype(b)>, data);
    return data;
}

int main() {
    for(auto& x : parse("hello_world_"))
        std::cout << x << "\n";
}
sehe
  • Having the rule declaration mention the iterator type is a burden, since I implement a BNF in the style shown by the x3 examples, e.g. calc9, but with ~300 rules. I would have to carry the iterator type up to the top-level rule. A solution would be to use the typedef [e.g. from parser config](https://github.com/boostorg/spirit/blob/develop/example/x3/calc/calc9/config.hpp) and parametrize the `string_view` rules. Otherwise, this problem with `string_view` will come up from time to time until `string_view` is a first-class citizen of x3, like e.g. `std::vector`. The best option is probably to use x3::raw as is. – Olx May 14 '18 at 15:09
  • ... and use iterator_range with the iterator to be instanced later. Anyway, your answer solves the problem! – Olx May 14 '18 at 15:11
  • Oh, yeah it's a pity, and you can fight your way out of the corner using enough template meta-programming. However, for now I think my quick fix shown under "You can hide this too:" already achieves the goal of "don't repeat yourself" w.r.t. the iterator type. – sehe May 14 '18 at 15:18
  • And I agree that there is a problem to be fixed with attribute forwarding to string_view in x3. If you can postpone the "neat" solution until such a time, that's probably wise. Do consider filing an issue at the mailing list or https://github.com/boostorg/spirit – sehe May 14 '18 at 15:19