For the current grammar I am parsing with X3, whitespace and Perl-style comments are ignored.

It seems to me that a skip parser in X3 is just a normal parser, and whatever input it consumes is considered "skipped." I came up with this:

namespace x3 = boost::spirit::x3;
auto const blank_comment = 
   x3::blank | x3::lexeme[ '#' >> *(x3::char_ - x3::eol) >> x3::eol ];

On a very basic input (a couple of comment lines and one quoted-string line), this seems to work well. (Live on Coliru)

However, as I can't find any documentation on the matter and the details of current skip parsers are tucked away in an intricate system of templates, I was hoping for some input.

  1. Is this the proper way of defining a "skip parser"? Is there a standard method?
  2. Are there performance concerns with an implementation like this? How would it be improved?

I previously searched SO for details and found an answer using Qi (Custom Skip Parser with Boost::Spirit). As I never learned Qi, many of the details are hard to follow. The method I described above seems more intuitive.

Zac

1 Answer

Yeah that's fine.

The skipper seems pretty optimal. You could optimize the quoted_string rule by reordering and using character set negation (operator~):

Live On Coliru

#include <boost/spirit/home/x3.hpp>

namespace parser {
    namespace x3 = boost::spirit::x3;
    auto const quoted_string = x3::lexeme [ '"' >>  *('\\' >> x3::char_ | ~x3::char_("\"\n")) >> '"' ];
    auto const space_comment = x3::space | x3::lexeme[ '#' >> *(x3::char_ - x3::eol) >> x3::eol];
}

#include <iostream>
#include <string>
int main() {
    std::string result, s1 = "# foo\n\n#bar\n   \t\"This is a simple string, containing \\\"escaped quotes\\\"\"";

    phrase_parse(s1.begin(), s1.end(), parser::quoted_string, parser::space_comment, result);

    std::cout << "Original: `" << s1 << "`\nResult: `" << result << "`\n";
}

Prints

Original: `# foo

#bar
    "This is a simple string, containing \"escaped quotes\""`
Result: `This is a simple string, containing "escaped quotes"`
sehe
  • Thanks for the optimization tip and answer. – Zac Feb 10 '16 at 00:27
  • Maybe for `space_comment` it is better to use [seek directive](http://www.boost.org/libs/spirit/repository/doc/html/spirit_repository/qi_components/directives/seek.html)? – Tomilov Anatoliy Feb 11 '16 at 01:52
  • @Orient I dint recommend depending on unsupported contributor chide fir something that looks a lot like premature optimization. – sehe Feb 11 '16 at 06:46
  • ok. Phones are hard to use. "I don't recommend depending on unsupported contributor code for" ... – sehe Feb 11 '16 at 12:02
  • Does the repository contain bad-quality code? I proposed it above not for optimization's sake, but for readability. – Tomilov Anatoliy Feb 11 '16 at 13:10
  • It's not bad code. It's just not supported. The docs are pretty clear about this. I strongly suspect it's not better in any way here. – sehe Feb 11 '16 at 13:11
  • BTW [seek directive](https://github.com/boostorg/spirit/blob/develop/include/boost/spirit/home/x3/directive/seek.hpp) is a part of main *Spirit X3* code. I just linked the relevant documentation from *Spirit V2* repository. – Tomilov Anatoliy Feb 17 '16 at 03:27