0

I have a string read from a file, for example an XML file. I just want to know: what's the simplest way to format the content of the following tag?

<member names="John:Frank" family="Smith:Wesson"/>

I just want John, Frank, Smith and Wesson, each one as a separate string.

herohuyongtao
user3012765
  • Did you try splitting the string on ":"? – Ahmed Hamdy Jan 08 '14 at 12:42
  • 1
    I think you can use a regexp, but if you have to do this on XML, use an existing XML parser. – MokaT Jan 08 '14 at 12:43
  • 7
    That's not "formatting" you're looking for, it's parsing / tokenizing. – DevSolar Jan 08 '14 at 12:44
  • Are you able to extract the string from the XML tags first? – gaurav5430 Jan 08 '14 at 12:44
  • Assuming that you can parse the XML and get the strings out, then for each string you could do something similar to [this](https://stackoverflow.com/questions/14265581/parse-split-a-string-in-c-using-string-delimiter-standard-c) (split on a delimiter) or [this](https://stackoverflow.com/questions/236129/how-to-split-a-string-in-c) (split using streams etc.). Google is your friend for other options. – Laur Ivan Jan 08 '14 at 12:50

4 Answers

2

Use an XML parser to parse the XML and then split the attribute values on ':'. RapidXML is one such parser.

MokaT
Daan Olislagers
  • I don't want to use external libraries. – user3012765 Jan 08 '14 at 12:50
  • Then first parse the strings out (e.g. with [regex](http://www.johndcook.com/cpp_regex.html) to identify the strings). – Laur Ivan Jan 08 '14 at 12:54
  • 4
    @user3012765 You definitely want to use an external library if you want to do things the right way. Check out this brilliant answer about manually parsing HTML, which applies equally to XML: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – GabrielF Jan 08 '14 at 14:33
1

In case you like a quick 'n dirty grammar in Boost Spirit:

See it Live on Coliru

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;

int main()
{
    std::string const example = "<member names=\"John:Frank\" family=\"Smith:Wesson\"/>";

    std::vector<std::string> data;

    // Repeatedly: skip everything up to the next '"', then collect the
    // ':'-separated tokens up to the closing '"'.
    if (qi::parse(begin(example), end(example),
             *(
                 qi::omit [ *~qi::char_('"') ] >> '"' >> qi::as_string [ *~qi::char_("\":") ] % ':' >> '"'
              ),
             data))
    {
        for (auto const& item : data)
            std::cout << item << "\n";
    }
}

Output

clang++ -std=c++11 main.cpp && ./a.out
John
Frank
Smith
Wesson
sehe
0

I would go with substr combined with find:

  • Find the position of the first " in the string.
  • Find the next " after that position. You now have two positions, the opening and the closing ", and can take the substring between them.
  • Split the resulting substring on : in the same way.
  • Repeat while the string still contains an opening and a closing ".
Yabada
0

Your question is all about string manipulation; it has nothing to do with XML parsing.

Getting the four names from <member names="John:Frank" family="Smith:Wesson"/> is quite easy. You just need to:

  1. Treat <member names="John:Frank" family="Smith:Wesson"/> as a string.

  2. Find the two substrings delimited by "; you will get John:Frank and Smith:Wesson.

  3. Split each of these substrings on :; you will get John, Frank, Smith and Wesson. Done!

herohuyongtao
  • This is just not the right way to parse XML... http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – GabrielF Jan 08 '14 at 14:35