Why does this parser leave 'b' in the attribute, even though the optional part wasn't matched?

namespace qi = boost::spirit::qi;   // alias needed for the qi:: names used below
using namespace boost::spirit::qi;

std::string str = "abc";

auto a = char_("a");
auto b = char_("b");
qi::rule<std::string::iterator, std::string()> expr;
expr = +a >> -(b >> +a);

std::string res;

bool r = qi::parse(
        str.begin(),
        str.end(),
        expr >> lit("bc"),
        res
);

It parses successfully, but res is "ab".

If I parse "abac" with expr alone, the optional part matches and the attribute is "aba".

The same goes for "aac": the optional part never starts to match, and the attribute is "aa".

But with "ab", the attribute is "ab", even though b gets backtracked and, as in the example, is matched by the next parser.

UPD

With expr.name("expr"); and debug(expr); I got

<expr>
  <try>abc</try>
  <success>bc</success>
  <attributes>[[a, b]]</attributes>
</expr>
Mikhail Cheshkov

2 Answers

Firstly, it's UB to use the auto variables to keep the expression templates, because they hold references to the temporaries "a" and "b" [1].

Instead write

expr = +qi::char_("a") >> -(qi::char_("b") >> +qi::char_("a"));

or, if you insist:

auto a = boost::proto::deep_copy(qi::char_("a"));
auto b = boost::proto::deep_copy(qi::char_("b"));
expr = +a >> -(b >> +a);

Now, noticing the >> lit("bc") part hiding in the parse call suggests that you may expect backtracking over successfully matched tokens when a parse failure happens down the road.

That doesn't happen: Spirit generates PEG grammars, and always greedily matches from left to right.
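To illustrate the greedy, left-to-right matching, here is a minimal sketch in plain C++. It is a toy model, not Spirit code: `match_star` and `star_a_then` are hypothetical helpers standing in for `*char_('a')` and for the sequence `*char_('a') >> char_(next)`.

```cpp
#include <cstddef>
#include <string>

// Toy model of a greedy *char_(c): consumes as many c as possible
// and always succeeds. Returns the position after the last match.
std::size_t match_star(const std::string& s, std::size_t pos, char c) {
    while (pos < s.size() && s[pos] == c) ++pos;
    return pos;
}

// Toy model of the sequence *char_('a') >> char_(next).
// Once the star has succeeded it is never re-tried with fewer characters.
bool star_a_then(const std::string& s, char next) {
    std::size_t pos = match_star(s, 0, 'a');
    return pos < s.size() && s[pos] == next;
}
```

In this model `star_a_then("aab", 'b')` succeeds, while `star_a_then("aaa", 'a')` fails: the star has already consumed every 'a', and nothing gives one back.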


As for the sample given: "ab" results because, even though backtracking does occur, the effects on the attribute are not rolled back without qi::hold: Live On Coliru

Container attributes are passed along by reference, and the effects of previous (successful) expressions are not rolled back unless you tell Spirit to. This way, you "pay for what you use" (copying temporaries all the time would be costly).

See e.g.

<a>
  <try>abc</try>
  <success>bc</success>
  <attributes>[a]</attributes>
</a>
<a>
  <try>bc</try>
  <fail/>
</a>
<b>
  <try>bc</try>
  <success>c</success>
  <attributes>[b]</attributes>
</b>
<a>
  <try>c</try>
  <fail/>
</a>
<bc>
  <try>bc</try>
  <success></success>
  <attributes>[]</attributes>
</bc>
Success: 'ab'
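The trace above can be mimicked with a small toy model in plain C++. This is only a sketch of the mechanism, not Spirit's implementation: `Input`, `Result`, `match_char`, `match_plus`, `match_optional_tail`, `match_expr` and `run_expr` are all hypothetical names, and every parser appends to the same `std::string` passed by reference.

```cpp
#include <cstddef>
#include <string>

// Toy model only: hand-written stand-ins for the Qi parsers in the question.
struct Input {
    std::string text;
    std::size_t pos;
};

// Models char_(c): on a match, consume one character and append it to attr.
bool match_char(Input& in, char c, std::string& attr) {
    if (in.pos < in.text.size() && in.text[in.pos] == c) {
        attr.push_back(c);
        ++in.pos;
        return true;
    }
    return false;
}

// Models +char_(c): one or more matches, all appended to the same attr.
bool match_plus(Input& in, char c, std::string& attr) {
    if (!match_char(in, c, attr)) return false;
    while (match_char(in, c, attr)) {}
    return true;
}

// Models -(b >> +a) without hold: the input position is restored on
// failure, but whatever was already appended to attr stays there.
bool match_optional_tail(Input& in, std::string& attr) {
    std::size_t saved = in.pos;
    if (match_char(in, 'b', attr) && match_plus(in, 'a', attr)) return true;
    in.pos = saved;  // the input is rewound, the attribute is not
    return true;     // an optional parser always succeeds
}

// Models expr = +a >> -(b >> +a).
bool match_expr(Input& in, std::string& attr) {
    return match_plus(in, 'a', attr) && match_optional_tail(in, attr);
}

struct Result {
    bool ok;
    std::string attr;  // what ended up in the attribute
    std::string rest;  // unconsumed input
};

Result run_expr(const std::string& text) {
    Input in{text, 0};
    std::string attr;
    bool ok = match_expr(in, attr);
    return Result{ok, attr, in.text.substr(in.pos)};
}
```

In this model `run_expr("abc")` leaves "bc" unconsumed yet reports the attribute "ab", which matches the behaviour in the question; "abac" gives "aba" and "aac" gives "aa".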

[1] see here:

sehe
  • But, I don't get it, you use binary minus? That's different language, isn't it? – Mikhail Cheshkov Oct 03 '14 at 20:28
  • @MikhailCheshkov I was just noticing this typo. **Updated** the answer. Excuse my mistake :/ – sehe Oct 03 '14 at 20:34
  • I expect backtracking on non-matched characters - option will fail if no a was found after b, so b will not be consumed, and it will be matched by `lit("bc")`. – Mikhail Cheshkov Oct 03 '14 at 20:44
  • And I've done it with `hold` on option content, thanks for links! – Mikhail Cheshkov Oct 03 '14 at 20:48
  • @MikhailCheshkov I was still editing the answer. Sorry for the delays, I'm having to divide my attention here. I'll be done soon. [tester here](http://coliru.stacked-crooked.com/a/17771944b428cb30) – sehe Oct 03 '14 at 20:52
Quoting @sehe from this SO question

A string attribute is a container attribute and many elements could be assigned into it by different parser subexpressions. Now for efficiency reasons, Spirit doesn't rollback the values of emitted attributes on backtracking.

So I wrapped the optional part in qi::hold, and that fixed it:

expr = +qi::char_("a") >> -(qi::hold[qi::char_("b") >> +qi::char_("a")]);

For more information, see the question mentioned above and the hold docs.
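As a rough illustration of what hold buys you, here is a toy model in plain C++. It follows the documented idea of the directive (parse into a copy of the attribute, swap the copy in only on success), but it is a sketch and not Spirit's code: `Input`, `match_char`, `match_plus`, `match_held_tail` and `parse_with_hold` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Toy model only: hand-written stand-ins for the Qi parsers above.
struct Input {
    std::string text;
    std::size_t pos;
};

// Models char_(c): on a match, consume one character and append it to attr.
bool match_char(Input& in, char c, std::string& attr) {
    if (in.pos < in.text.size() && in.text[in.pos] == c) {
        attr.push_back(c);
        ++in.pos;
        return true;
    }
    return false;
}

// Models +char_(c): one or more matches appended to attr.
bool match_plus(Input& in, char c, std::string& attr) {
    if (!match_char(in, c, attr)) return false;
    while (match_char(in, c, attr)) {}
    return true;
}

// Models hold[b >> +a]: work on a copy of the attribute and
// publish it only if the whole subject parser succeeds.
bool match_held_tail(Input& in, std::string& attr) {
    std::string copy = attr;
    std::size_t saved = in.pos;
    if (match_char(in, 'b', copy) && match_plus(in, 'a', copy)) {
        std::swap(copy, attr);  // success: the copy becomes the attribute
        return true;
    }
    in.pos = saved;             // failure: rewind, attr was never touched
    return false;
}

// Models +a >> -(hold[b >> +a]); returns the resulting attribute,
// or an empty string if the leading +a does not match.
std::string parse_with_hold(const std::string& text) {
    Input in{text, 0};
    std::string attr;
    if (!match_plus(in, 'a', attr)) return std::string();
    match_held_tail(in, attr);  // optional: a failure here is not an error
    return attr;
}
```

With this model `parse_with_hold("abc")` yields "a" instead of "ab", while "abac" still yields "aba". The extra string copy is the cost that Spirit avoids by default.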

Mikhail Cheshkov
  • That info is right in my answer. Mmmm. Maybe because I didn't show the usage of `hold` there. Ah well, +1 (thanks for putting up with my absent mindedness) – sehe Oct 03 '14 at 21:03