
We are trying to parse a simple number/text grammar. The text can contain numbers, so the input sequence must be split into a vector of elements of two types (TEXT and NUMBER). A number can be in any of the following formats:

+10.90
10.90
10
+10
-10

So we wrote this grammar:

struct CMyTag
{
    TagTypes tagName;
    std::string tagData;
    std::vector<CMyTag> tagChild;
};
BOOST_FUSION_ADAPT_STRUCT(::CMyTag, (TagTypes, tagName) (std::string, tagData) (std::vector<CMyTag>, tagChild))

template <typename Iterator>
struct TextWithNumbers_grammar : qi::grammar<Iterator, std::vector<CMyTag>()>
{
    TextWithNumbers_grammar() :
        TextWithNumbers_grammar::base_type(line)
    {
        line = +(numbertag | texttag);

        number = qi::lexeme[-(qi::lit('+') | '-') >> +qi::digit >> *(qi::char_('.') >> +qi::digit)];
        numbertag = qi::attr(NUMBER) >> number;

        text = +(~qi::digit - (qi::char_("+-") >> qi::digit));
        texttag = qi::attr(TEXT) >> text;
    }

    qi::rule<Iterator, std::string()> number, text;
    qi::rule<Iterator, CMyTag()> numbertag, texttag;
    qi::rule<Iterator, std::vector<CMyTag>()> line;
};

Everything works fine, except when we try to parse this line:

wernwl kjwnwenrlwe +10.90+ klwnfkwenwf

We get a 3-element vector, as expected, but the text of the last element (CMyTag.tagData) is:

++ klwnfkwenwf

An extra "+" symbol is added. We also tried rewriting the grammar to simply skip the number rule:

text = qi::skip(number)[+~qi::digit];

But the parser crashed with a segmentation fault.

tantra35
  • If qi::lexeme is changed to qi::raw in the number rule, the extra '+' disappears. Mystical. – tantra35 May 26 '15 at 00:16
  • Not mystics. The raw disregards all propagated attributes. Lexeme didn't do anything in the first place (see http://stackoverflow.com/questions/17072987/boost-spirit-skipper-issues/17073965#17073965). Posting in a minute – sehe May 26 '15 at 00:18

1 Answer


Attribute values are not rolled back on backtracking. In practice this is only visible with container attributes (such as vector<> or string).

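In this case, the numbertag rule is tried first and matches the + sign. Then the number rule fails because no digit follows. The input position is reset, but the already-matched + is left in the attribute, so texttag appends its own match to a string that already contains it.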
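
The effect can be reproduced without Spirit. The following is a minimal hand-rolled sketch, not Spirit's actual implementation: `parse_signed_digits`, `parse_rest`, `parse_shared` and `parse_held` are hypothetical names invented for illustration. It shows an alternative whose branches write into one shared string, and a variant that parses into a copy and commits it only on success, which is the idea behind Spirit's `qi::hold[]` directive.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: matches [+-]digit+ and appends each matched
// character to attr as soon as it is matched.
bool parse_signed_digits(const std::string& in, std::size_t& pos, std::string& attr) {
    std::size_t p = pos;
    if (p < in.size() && (in[p] == '+' || in[p] == '-'))
        attr += in[p++];               // stored before we know the rule succeeds
    std::size_t digits = 0;
    while (p < in.size() && std::isdigit(static_cast<unsigned char>(in[p]))) {
        attr += in[p++];
        ++digits;
    }
    if (digits == 0)
        return false;                  // pos is untouched, but attr keeps the sign
    pos = p;
    return true;
}

// Hypothetical helper: consumes the rest of the input as text.
bool parse_rest(const std::string& in, std::size_t& pos, std::string& attr) {
    if (pos >= in.size())
        return false;
    attr.append(in, pos, std::string::npos);
    pos = in.size();
    return true;
}

// "number | text" where both branches share one attribute.
std::string parse_shared(const std::string& in) {
    std::size_t pos = 0;
    std::string attr;
    if (!parse_signed_digits(in, pos, attr))
        parse_rest(in, pos, attr);     // appends after the leftover sign
    return attr;
}

// Same alternative, but the first branch works on a copy of the attribute
// and the copy is swapped in only if the branch succeeds.
std::string parse_held(const std::string& in) {
    std::size_t pos = 0;
    std::string attr;
    std::string copy = attr;
    if (parse_signed_digits(in, pos, copy))
        attr.swap(copy);
    else
        parse_rest(in, pos, attr);
    return attr;
}
```

Calling `parse_shared("+ klwnfkwenwf")` yields the doubled sign seen in the question, while `parse_held` on the same input yields the text unchanged. The copy costs an extra allocation per attempt, which is presumably why rollback is opt-in rather than the default.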

I don't know exactly what you're trying to do, but it looks like you just want:

line      = +(numbertag | texttag);

numbertag = attr(NUMBER) >> raw[double_];
texttag   = attr(TEXT)   >> raw[+(char_ - double_)];

With a number parser that does not accept exponents (as in the live demo below), the input "wernwl kjwnwenrlwe +10.90e3++ klwnfkwenwf" prints

Parse success: 5 elements
TEXT    'wernwl kjwnwenrlwe '
NUMBER  '+10.90'
TEXT    'e'
NUMBER  '3'
TEXT    '++ klwnfkwenwf'
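
To make the splitting rule behind that output explicit, here is a hand-rolled sketch of the same idea in plain C++, with no Spirit involved. `Token`, `match_number` and `split` are hypothetical names, and the number format is an assumption taken from the question: an optional sign, digits, and an optional fractional part, with no exponent. A real Spirit `real_parser` accepts more than this (for example a leading or trailing decimal point), so treat this as an approximation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    bool isNumber;      // true for NUMBER, false for TEXT
    std::string data;   // the raw matched characters
};

// Returns the length of a number starting at pos, or 0 if none starts there.
// Assumed format: [+-] digit+ [ '.' digit+ ]
std::size_t match_number(const std::string& s, std::size_t pos) {
    auto is_digit = [&](std::size_t i) {
        return i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]));
    };
    std::size_t p = pos;
    if (p < s.size() && (s[p] == '+' || s[p] == '-'))
        ++p;
    if (!is_digit(p))
        return 0;                       // a lone sign is not a number
    while (is_digit(p))
        ++p;
    if (p < s.size() && s[p] == '.' && is_digit(p + 1)) {
        ++p;
        while (is_digit(p))
            ++p;
    }
    return p - pos;
}

// Mirrors +(numbertag | texttag): try a number first; otherwise collect
// characters as text until a number could start (like +(char_ - number)).
std::vector<Token> split(const std::string& s) {
    std::vector<Token> out;
    std::size_t pos = 0;
    while (pos < s.size()) {
        std::size_t n = match_number(s, pos);
        if (n > 0) {
            out.push_back({true, s.substr(pos, n)});
            pos += n;
            continue;
        }
        std::size_t start = pos;
        do {
            ++pos;
        } while (pos < s.size() && match_number(s, pos) == 0);
        out.push_back({false, s.substr(start, pos - start)});
    }
    return out;
}
```

Because each token stores a substring of the input rather than characters accumulated during matching, a failed number attempt cannot leave anything behind. This corresponds to what `raw[]` does in the grammar above.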

Live Demo

Live On Coliru

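#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;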

enum TagTypes { NUMBER, TEXT, };

struct CMyTag {
    TagTypes tagName;
    std::string tagData;
};
BOOST_FUSION_ADAPT_STRUCT(::CMyTag, (TagTypes, tagName) (std::string, tagData))

template <typename Iterator>
struct TextWithNumbers_grammar : qi::grammar<Iterator, std::vector<CMyTag>()>
{
    TextWithNumbers_grammar() : TextWithNumbers_grammar::base_type(line)
    {
        using namespace qi;
        line      = +(numbertag | texttag);

        numbertag = attr(NUMBER) >> raw[number];
        texttag   = attr(TEXT)   >> raw[+(char_ - number)];
    }

  private:
    template <typename T>
        struct simple_real_policies : boost::spirit::qi::real_policies<T>
    {
        template <typename It> //  No exponent
            static bool parse_exp(It&, It const&) { return false; }

        template <typename It, typename Attribute> //  No exponent
            static bool parse_exp_n(It&, It const&, Attribute&) { return false; }
    };

    qi::real_parser<double, simple_real_policies<double> > number;
    qi::rule<Iterator, CMyTag()> numbertag, texttag;
    qi::rule<Iterator, std::vector<CMyTag>()> line;
};

int main() {

    std::string const input = "wernwl kjwnwenrlwe +10.90e3++ klwnfkwenwf";
    using It = std::string::const_iterator;

    It f = input.begin(), l = input.end();

    std::vector<CMyTag> data;
    TextWithNumbers_grammar<It> g;

    if (qi::parse(f, l, g, data)) {
        std::cout << "Parse success: " << data.size() << " elements\n";
        for (auto& s : data) {
            std::cout << (s.tagName == NUMBER?"NUMBER":"TEXT")
                      << "\t'" << s.tagData << "'\n";
        }
    } else {
        std::cout << "Parse failed\n";
    }

    if (f!=l)
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
sehe
  • Really, really cool! Much simpler, and it works as expected. I had complicated the number grammar because double_ supports exponential notation (like 1e+3), which is not the behavior I wanted. – tantra35 May 26 '15 at 09:39
  • @RuslanUsifov Oh, in that case use a custom [`real_policies`](http://www.boost.org/doc/libs/1_58_0/libs/spirit/doc/html/spirit/qi/reference/numeric/real.html#spirit.qi.reference.numeric.real._code__phrase_role__identifier__realpolicies__phrase___code__specializations)! See it **[Live On Coliru](http://coliru.stacked-crooked.com/a/b8f1e7bef050e6c5)** (also updated answer) – sehe May 26 '15 at 10:28