In principle, you want the escapes to be interpreted during parsing.
Very rare exceptions would include when you intend to "only validate" and forward the same input. However, if that's the case then you wouldn't want any attributes (which is simple in Spirit: just don't pass one).
Also, forwarding input you never interpreted is a security smell, because you should probably never trust your input.
There's some other weirdness:
- you have a grammar with a skipper, and then the only rule is fully lexeme (see Boost spirit skipper issues).
- you handle \" but \ has no magic meaning otherwise. That's confusing.
- you have redundant lit() wrapping the character literals
- char_ - char_('"') could (should?) be written more efficiently as ~char_('"')
- there's a stray wide-character literal
Collapsing all these issues, I'd write the whole thing as
qi::rule<Iterator, std::string()> rule;
rule = '"' >> *~char_('"') >> '"';
With escapes, I'd write
rule = '"' >> *('\\' >> char_ | ~char_('"')) >> '"';
To expose the raw input:
rule = raw['"' >> *('\\' >> char_ | ~char_('"')) >> '"'];
And you can drop the entire grammar struct.
Illustrative Demo
No answer is complete without a live demo. In particular, it highlights a few of the oddities noted above.
Live On Coliru
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;

std::string parse(std::string const& input) {
    std::string result;
    static const qi::rule<std::string::const_iterator, std::string()> rule
        = '"' >> *('\\' >> qi::char_ | ~qi::char_('"')) >> '"';

    // throws qi::expectation_failure if the input doesn't match
    qi::parse(input.begin(), input.end(), qi::eps > rule > qi::eoi, result);
    return result;
}

int main() {
    // take the string by reference: std::quoted only keeps a reference to it
    auto sq = [](auto const& s) { return std::quoted(s, '\''); };
    auto dq = [](auto const& s) { return std::quoted(s, '"'); };

    for (std::string s : {
            R"("")",
            R"("hello")",
            R"("hello \"world\"! ")",
            R"("hello \'world\'! ")",
        }) {
        std::cout << s << " -> " << parse(s) << "\n";
        std::cout << sq(s) << " -> " << sq(parse(s)) << "\n";
        std::cout << dq(s) << " -> " << dq(parse(s)) << "\n";
        std::cout << "----\n";
    }
}
Prints
"" ->
'""' -> ''
"\"\"" -> ""
----
"hello" -> hello
'"hello"' -> 'hello'
"\"hello\"" -> "hello"
----
"hello \"world\"! " -> hello "world"!
'"hello \\"world\\"! "' -> 'hello "world"! '
"\"hello \\\"world\\\"! \"" -> "hello \"world\"! "
----
"hello \'world\'! " -> hello 'world'!
'"hello \\\'world\\\'! "' -> 'hello \'world\'! '
"\"hello \\'world\\'! \"" -> "hello 'world'! "
----
I'd like for this to be a Zen Koan. And the Koan ends:
The disciple meditated at the output of the code for 37 days and then he walked away enlightened.