boost::spirit::qi preserving white space

Question

I am using this code to parse a string of the form "k1=v1;k2=v2;k3=v3;kn=vn" into a map.

    qi::phrase_parse(
      begin,end,
      *(*~qi::char_('=') >> '=' >> *~qi::char_(';') >> -qi::lit(';')),
      qi::ascii::space, dict);

The above code removes whitespace characters, e.g. "some_key=1 2 3" becomes some_key -> 123

I can't figure out how to remove the fourth parameter, qi::ascii::space, or what to replace it with.

Basically, I want to preserve the original string (key and value) after splitting by '='.

I do not have much experience/knowledge with spirit. It does require investing time to learn.

Does this answer your question? [Boost spirit skipper issues](https://stackoverflow.com/questions/17072987/boost-spirit-skipper-issues) — Nikita Kniazev, May 01 '20 at 21:06

sehe · Accepted Answer · 2020-05-01T23:55:46.257

If you want no skipper, simply use qi::parse instead of qi::phrase_parse:

qi::parse(
  begin,end,
  *(*~qi::char_(";=") >> '=' >> *~qi::char_(';') >> -qi::lit(';')),
  dict);

However, you likely DO want to selectively skip whitespace. The easiest way is usually to have a general skipper, and then mark the lexeme areas (where you don't allow the skipper):

qi::phrase_parse(
  begin, end,
  *(qi::lexeme[+(qi::graph - '=')]
     >> '='
     >> qi::lexeme[*~qi::char_(';')] >> (qi::eoi|';')),
  qi::ascii::space, dict);

The linked answer gives more techniques and background on how to work with skippers in Qi.

DEMO TIME

Live On Coliru

#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <iomanip>
#include <iostream>
#include <map>
#include <string>
namespace qi = boost::spirit::qi;

int main() {
    for (std::string const& input : { 
            R"()",
            R"(foo=bar)",
            R"(foo=bar;)",
            R"( foo = bar ; )",
            R"( foo = bar ;
foo 
= qux; baz =

    quux 
corge grault
 thud

; x=)",
            // failing:
            R"(;foo = bar;)",
        })
    {
        std::cout << "-------------------------\n";
        auto f=begin(input), l=end(input);

        std::multimap<std::string, std::string> dict;

        bool ok = qi::phrase_parse(f, l,
          (qi::lexeme[+(qi::graph - '=' - ';')]
             >> '='
             >> qi::lexeme[*~qi::char_(';')]
          ) % ';',
          qi::space,
          dict);

        if (ok) {
            std::cout << "Parsed " << dict.size() << " elements:\n";
            for (auto& [k,v]: dict) {
                std::cout << " - " << std::quoted(k) << " -> " << std::quoted(v) << "\n";
            }
        } else {
            std::cout << "Parse failed\n";
        }

        if (f!=l) {
            std::cout << "Remaining input: " << std::quoted(std::string(f,l)) << "\n";
        }
    }

}

Prints

-------------------------
Parse failed
-------------------------
Parsed 1 elements:
 - "foo" -> "bar"
-------------------------
Parsed 1 elements:
 - "foo" -> "bar"
Remaining input: ";"
-------------------------
Parsed 1 elements:
 - "foo" -> "bar "
Remaining input: "; "
-------------------------
Parsed 4 elements:
 - "baz" -> "quux 
corge grault
 thud

"
 - "foo" -> "bar "
 - "foo" -> "qux"
 - "x" -> ""
-------------------------
Parse failed
Remaining input: ";foo = bar;"

Thank you for the detailed answer. This is really good. I really have no idea about this Expression construct and I will not, unless I start reading the spirit tutorial from the beginning. But I do understand what it does. From what I see, for the KEY, leading and trailing whitespace is removed. For the VAL, leading whitespace is removed and all other whitespace (middle, trailing) is preserved. I will use it for my solution. It makes more sense than just using parse() — M.K., May 02 '20 at 01:29
How do I adjust the expression to skip both leading and trailing whitespace chars? I think that would make sense: only preserve spaces inside a value string. — M.K., May 04 '20 at 14:23
This feels a bit like cheating, but is probably the most elegant way to achieve this, leveraging the skipper itself: http://coliru.stacked-crooked.com/a/57775008b2b357ce. — sehe, May 04 '20 at 14:30
Because `qi::raw[]` returns the raw input, which may not always be what you want (think e.g. supporting character escapes). See e.g. https://stackoverflow.com/questions/7436481/how-to-make-my-split-work-only-on-one-real-line-and-be-capable-to-skip-quoted-pa/7462539#7462539 — sehe, May 04 '20 at 16:04
I am a bit confused. In the answer you give this expression: `*(qi::lexeme[+(qi::graph - '=')] >> '=' >> qi::lexeme[*~qi::char_(';')] >> (qi::eoi|';'))`. In the DEMO it is: `(qi::lexeme[+(qi::graph - '=' - ';')] >> '=' >> qi::lexeme[*~qi::char_(';')]) % ';'`. The output seems to be the same. — M.K., May 04 '20 at 16:09
@M.K. Good spot. I decided that excluding ';' made sense while adding more test cases. I ended up not including all the edge cases because they'd only raise questions about requirements, and they weren't relevant to the answer. You can obviously change the grammar as required :) — sehe, May 04 '20 at 19:32
