Parsing escaped strings with boost spirit

Question

I'm working with Spirit 2.4 and I want to parse a structure like this:

Text{text_field};

The catch is that text_field is an escaped string that may contain the symbols '{', '}' and '\'. I would like to create a parser for this using qi. This is what I've tried so far:

using boost::spirit::standard::char_;
using boost::spirit::standard::string;
using qi::lexeme;
using qi::lit;

qi::rule< IteratorT, std::string(), ascii::space_type > text;
qi::rule< IteratorT, std::string(), ascii::space_type > content;
qi::rule< IteratorT, std::string(), ascii::space_type > escChar;


text %= 
  lit( "Text" ) >> '{' >>
    content >>
  "};"
  ;

content %= lexeme[ +( +(char_ - ( lit( '\\' ) | '}' ) )  >> escChar ) ];

escChar %= string( "\\\\" ) 
  | string( "\\{" ) 
  | string( "\\}" );

But it doesn't even compile. Any ideas?

The compiler error (and the line it's on) would help. – Marcelo Cantos Oct 26 '10 at 21:31

score 8 · Accepted Answer · edited May 13 '16 at 16:24

Your grammar could be written as:

qi::rule< IteratorT, std::string(), ascii::space_type > text; 
qi::rule< IteratorT, std::string() > content;   
qi::rule< IteratorT, char() > escChar;   

text = "Text{" >> content >> "};";  
content = +(~char_('}') | escChar); 
escChar = '\\' >> char_("\\{}");

i.e.

text is Text{ followed by content followed by };
content is at least one instance of either a character (other than }) or an escChar
escChar is a single escaped \, {, or }

Note that the escChar rule now returns a single character and discards the escaping \. I'm not sure if that's what you need. Additionally, I removed the skipper from the content and escChar rules, which makes it possible to leave off the lexeme[] (a rule without a skipper acts like an implicit lexeme).

edited May 13 '16 at 16:24 by Felix Dombek

answered Oct 27 '10 at 01:46 by hkaiser

Hi hkaiser, and thanks for helping. I've tried your solution but it fails to parse this: Text{ \} }; I thought that it was because the parser ~char_('}') matches the backslash, but I tried the following with no success: content = +( ~char_( "\\\\}" ) | escChar );. Any ideas? – Bruno Oct 27 '10 at 17:19

Yeah, right. ~char_('}') does indeed match the backslash. I'm sorry for this oversight. If you change that to ~char_("\\}") it should not do that anymore. – hkaiser Oct 28 '10 at 01:37
