
I am writing a parser to find string concatenation expressions. The strings I need to handle are enclosed in parentheses and come mainly from function calls.

For example, ("one"+"two"+"three") -> ("one"|"two"|"three") is a simple case that I can handle.

A more difficult case is (null, "one"+"two"+"three", null) -> (null, "one"|"two"|"three", null), but I am able to parse it with boost::tokenizer.

The hard case is (null, "one"+"two"+"three,four", 1 /* third parameter can be: 1, 2, 3 */), where commas appear inside a string literal and inside a comment. For input like this I think boost::spirit is the right tool, but I need help writing rules for it.

Later:

It seems that escaped_list_separator from boost::tokenizer is what I need, but I have one problem with it:

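   #include <boost/tokenizer.hpp>
   #include <iostream>
   #include <string>

   int main() {
       using namespace std;
       using namespace boost;
       string s = "Field 1,\"putting quotes around fields, allows commas\",Field 3";
       // No escape character, comma as separator, double quote as quote character
       tokenizer<escaped_list_separator<char> > tok(s, escaped_list_separator<char>("", ",", "\""));
       for (tokenizer<escaped_list_separator<char> >::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
           cout << "~~~" << *beg << "\n";
       }
   }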

This removes the " characters. Is it possible to keep the quotes in the output, like this?

Field 1
"putting quotes around fields, allows commas"
Field 3
triclosan

1 Answer


Basically, you can use operator- to exclude characters from a char_ match, and the Kleene star to repeat it:

   rule = '"' >> *(char_ - '"') >> '"';

Also look at operator~, which inverts a charset (for example, ~char_('"')).

If you are interested in escaping quotes inside quotes as well, and perhaps commenting styles at the same time, I recommend having a look at my answer here:

Showing (partially) quoted cells in CSV files, including escaped quotes inside strings.

sehe