I'm having trouble understanding exactly how and when Spirit decides to merge matches into a single entity. I am trying to match a list of words inside double square brackets and extract the full text inside the brackets. Example:

[[This is some single-spaced text]] -> "This is some single-spaced text"

My grammar is as follows:

qi::rule<Iterator, std::string()> word  = +(char_ - char_(" []"));
qi::rule<Iterator, std::string()> entry = lit("[[") >> word >> *(char_(' ') >> word) >> lit("]]") >> -qi::eol;

std::string text;
bool r = parse( first, last, entry, text );

However, this parses the example text as follows:

[[This is some single-spaced text]] -> "Thisissomesingle-spacedtext"

I don't understand why this is happening. I'm not using lit for the space, and as far as I understand Spirit, none of my rules or parsers should be skipping whitespace. I'm also not sure how to verify that my grammar produces the attribute I want (for example, to avoid getting each space in a tuple with its word instead of having everything concatenated).

What should I do to obtain the result I want?

Svalorzen
  • @ruslo that's because you're doing something different :) I guessed it in my answer, this is quite a common trap with the `iostream` library – sehe Jun 08 '14 at 14:14

1 Answer

You're probably using a (string)stream. In that case, you will want to set std::noskipws on the stream:

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <sstream>
#include <string>

namespace qi = boost::spirit::qi;

int main()
{
    typedef boost::spirit::istream_iterator Iterator;

    std::istringstream iss("[[This is some single-spaced text]]");
    qi::rule<Iterator, std::string()> entry = "[[" >> qi::lexeme [ +(qi::char_ - "]]") ] >> "]]";

    // this is key:
    iss >> std::noskipws; // or:
    iss.unsetf(std::ios::skipws);

    Iterator f(iss), l;
    std::string parsed;
    if (qi::parse(f, l, entry >> -qi::eol, parsed))
    {
        std::cout << "Parsed: '" << parsed << "'\n";
    } else
        std::cout << "Failed\n";

    if (f!=l)
        std::cout << "Remaining: '" << std::string(f,l) << "'\n";
}

Prints

Parsed: 'This is some single-spaced text'
sehe
  • I've just added the demonstration _and_ a simplification of the grammar that you might like. See it **[Live On Coliru](http://coliru.stacked-crooked.com/a/5f56fd3b7025f790)** too – sehe Jun 08 '14 at 14:07
  • You are absolutely right. I was just so afraid I was misunderstanding Spirit that I didn't think the error could have been somewhere else. Thanks! – Svalorzen Jun 08 '14 at 14:07
  • Can I ask you a couple of things? I'm not sure I understand what the lexeme is supposed to do that is not there already. Also this will accept multiple spaces in a row, right? – Svalorzen Jun 08 '14 at 14:10
  • You're right. The `lexeme` is there out of good habit, but it doesn't add anything unless you had a skipper. See [for more info: here](http://stackoverflow.com/questions/17072987/boost-spirit-skipper-issues/17073965#17073965). If you want to be specific about the spaces, of course, check your input. However, the `[[syntax]]` suggests quotation marks already so I figured that would not be required, just my guess. – sehe Jun 08 '14 at 14:13