Alright, so assume we are given the following using directive and namespace alias:
using namespace boost::spirit::qi;
namespace phx = boost::phoenix;
And given the string:
std::string strLinesRecur = "%%DocumentNeededResources: CMap (90pv-RKSJ-UCS2C)";
We would like to extract the "code" inside the parentheses into res:
std::string res;
One way to do this is to use boost::phoenix::ref in a semantic action.
So, given a parser for the code:
using boost::spirit::ascii::alnum;
auto code = copy(+(alnum | char_('-')));
(This is roughly what the regex [a-zA-Z0-9\-]+ would match: one or more letters, digits, or hyphens.)
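For comparison, here is a minimal std::regex sketch of the same character set. Digits are included because alnum matches them, and the trailing + mirrors the leading + of the Spirit parser. The helper name looks_like_code is hypothetical and only serves the illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original answer): returns true when the
// whole of `s` consists of one or more ASCII letters, digits, or hyphens,
// which is roughly what +(alnum | char_('-')) accepts.
bool looks_like_code(const std::string& s) {
    static const std::regex pattern(R"([a-zA-Z0-9\-]+)");
    return std::regex_match(s, pattern);
}
```

With this sketch, looks_like_code("90pv-RKSJ-UCS2C") should be true, while an empty string or a string containing a parenthesis should be rejected.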
We can create our own grammar for the whole string:
using boost::spirit::ascii::alpha;
auto grammar = copy(
    (char_('%') >> char_('%') >> +alpha >> char_(':'))
    >> +alpha >> char_('(') >> as_string[lexeme[code]][phx::ref(res) = _1] >> char_(')'));
This parses anything that begins with two % characters, continues with some alphabetic characters and a :, then some more alphabetic characters, and ends with a "code" within parentheses.
The key part is as_string[lexeme[code]][phx::ref(res) = _1]. Breaking it down: lexeme[code] disables the skipper while code is being matched, so the code is treated as one atomic unit; as_string exposes the result as a std::string (as opposed to a std::vector<char>); and [phx::ref(res) = _1] is a semantic action that stores the parsed string in res (_1 is a placeholder for the attribute of the parser the action is attached to, here the string produced by as_string).
In this case blanks (spaces and tabs) between tokens are skipped, because blank is passed as the skipper in the following call:
using boost::spirit::ascii::blank;
phrase_parse(begin(strLinesRecur), end(strLinesRecur), grammar, blank);
This is of course just an example of a grammar that would fit the string.
Note: copy refers to qi::copy, which is one way to store pieces of a grammar in objects such as code and grammar. Without it, the expression stored through auto holds dangling references to temporaries, so using it is undefined behavior (in practice, often a segmentation fault).