When parsing a language using Boost.Spirit, how can I ensure that I skip

// line comments

/* block
   comments */ and

/* /* nested
   block */ comments */

when reading in the code? At the moment, I just do a phrase_parse into a predefined qi::grammar. I guess what I need is some sort of skipping lexer, right?

Dmitri Nesteruk
  • Yep. Confix parser might come handy. Here's an example: http://www.boost.org/doc/libs/1_64_0/libs/spirit/repository/example/qi/confix.cpp – Dan Mašek Jun 13 '17 at 19:58

1 Answer

No lexers required.

A sample grammar that implements this can be found under "Cross-platform way to get line number of an INI file where given option was found". Regardless, you can use a skipper like this:

using Skipper = qi::rule<Iterator>;

Skipper block_comment, single_line_comment, skipper;

single_line_comment = "//" >> *(char_ - eol) >> (eol|eoi);
block_comment = "/*" >> *(block_comment | char_ - "*/") > "*/";

skipper = single_line_comment | block_comment;

Of course, if whitespace should also be skipped, use:

skipper = space | single_line_comment | block_comment;

This supports nested block comments and throws qi::expectation_failure<> if a closing */ is missing.

Note that it deliberately does not treat a /* that appears inside a single-line comment as the start of a block comment.

Demo

Live On Coliru

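#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;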

int main() {
    using Iterator = boost::spirit::istream_iterator;
    using Skipper  = qi::rule<Iterator>;

    Skipper block_comment, single_line_comment, skipper;

    {
        using namespace qi;
        single_line_comment = "//" >> *(char_ - eol) >> (eol|eoi);
        block_comment       = ("/*" >> *(block_comment | char_ - "*/")) > "*/";

        skipper             = space | single_line_comment | block_comment;
    }

    Iterator f(std::cin >> std::noskipws), l;

    std::vector<int> data;
    bool ok = phrase_parse(f, l, *qi::int_, skipper, data);
    if (ok) {
        std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout << "Parsed ", " "));
        std::cout << "\n";
    } else {
        std::cout << "Parse failed\n";
    }

    if (f!=l) {
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}

Which prints:

Parsed 123 456 567 901 

Given the input

123 // line comments 234

/* block 345
   comments */ 456

567

/* 678 /* nested
   789 block */ comments 890 */

901
sehe
  • Added a demo [Live On Coliru](http://coliru.stacked-crooked.com/a/a0081642f447f926) – sehe Jun 13 '17 at 21:25
  • For reference, this is the same question in x3: https://stackoverflow.com/questions/59512152/skippers-in-boost-spirit-x3 – Michaël Sep 26 '21 at 17:53