My original string looks like this: `<Information name="Identify" id="IdentifyButton" type="button"/>`

From this string, how do I extract three substrings: `name_part = "Identify"`, `id_part = "IdentifyButton"`, and `type_part = "button"`?

yezy
  • Does this answer your question? [Right way to split an std::string into a vector](https://stackoverflow.com/questions/5607589/right-way-to-split-an-stdstring-into-a-vectorstring) – Den-Jason Sep 22 '20 at 21:23
  • No, actually I don't have just one delimiter. I agree that all the strings I need are enclosed in `" "`, but sometimes my original string may or may not contain either name or id, so I need to know which is which. – yezy Sep 22 '20 at 21:28
  • How do I get the first `"` after `name="`? – yezy Sep 22 '20 at 21:29
  • OK maybe try a tokeniser: https://stackoverflow.com/questions/53849/how-do-i-tokenize-a-string-in-c – Den-Jason Sep 22 '20 at 21:32
  • Alternatively, iterate the string, feeding the characters through a state machine that checks for the delimiters, so that spaces in strings are handled properly. With that you'll be able to identify what start-and-end indices to copy from the XML to extract the part and push to a string vector. See https://stackoverflow.com/questions/9438209/for-every-character-in-string I presume you don't want to just use `tinyxml` to parse it for you. – Den-Jason Sep 22 '20 at 21:39
  • this looks like a part of xml. Best approach is to use some xml parser. There are many ready solutions. – Marek R Sep 22 '20 at 21:39
  • Also see https://stackoverflow.com/questions/9473235/reading-an-xml-file-in-a-c-program – Den-Jason Sep 22 '20 at 21:42
  • _Assuming_ you have no spaces in the strings, you can 1) strip the leading `<` and terminating `/>` parts; 2) tokenize the remaining contents by splitting on spaces; 3) tokenize every part into a name+value pair by splitting it on the 'equals' sign `=`; 4) optionally strip the leading and terminating double quote marks from the value string. – CiaPan Sep 22 '20 at 22:46
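
The four steps in the last comment can be sketched roughly as follows. This is only an illustration under the same assumption (no spaces inside values and none around `=`); `parse_attributes` is a hypothetical helper name, not something from the thread or from a library.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parse `<Tag key="value" .../>` into a key -> value map.
// Assumes values contain no spaces and there are no spaces around '='.
std::map<std::string, std::string> parse_attributes(std::string s)
{
    // 1) Strip the leading '<' and the terminating "/>" (or ">").
    if (!s.empty() && s.front() == '<') s.erase(0, 1);
    if (!s.empty() && s.back() == '>') s.pop_back();
    if (!s.empty() && s.back() == '/') s.pop_back();

    std::map<std::string, std::string> attrs;
    std::istringstream in(s);
    std::string token;
    in >> token;  // discard the tag name (e.g. "Information")

    // 2) Tokenize the remaining contents on whitespace.
    while (in >> token)
    {
        // 3) Split each token into name and value at the first '='.
        const std::string::size_type eq = token.find('=');
        if (eq == std::string::npos) continue;  // not a key=value token
        std::string key = token.substr(0, eq);
        std::string value = token.substr(eq + 1);

        // 4) Strip the surrounding double quotes, if present.
        if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
            value = value.substr(1, value.size() - 2);

        attrs[key] = value;
    }
    return attrs;
}
```

Because the result is keyed by attribute name, a missing `name` or `id` simply does not appear in the map, which can be checked with `attrs.count("id")`.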

2 Answers

Assuming you don't want to use third-party XML parsers, you can simply use std::string's find() for each of your names:

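#include <iostream>
#include <string>

int main()
{
  // The search keys must match the spacing in the input exactly (here: spaces around '=').
  std::string s("<Information name = \"Identify\" id = \"IdentifyButton\" type = \"button\" / >");
  std::string names[] = { "name = \"" , "id = \"" , "type = \"" };
  std::string::size_type posEnd(0);
  for (const auto& n : names)
  {
    // Look for the key after the previous match; skip attributes that are absent.
    std::string::size_type posStart = s.find(n, posEnd);
    if (posStart == std::string::npos)
      continue;
    posStart += n.length();
    // The value runs up to the next double quote.
    std::string::size_type valueEnd = s.find('"', posStart);
    if (valueEnd == std::string::npos)
      break; // unterminated value
    std::cout << s.substr(posStart, valueEnd - posStart) << std::endl;
    posEnd = valueEnd + 1;
  }
}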

Add error checking per your tolerance :)
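
Since the asker mentioned that `name` or `id` may be absent, one possible shape for that error checking is a per-attribute lookup that reports whether the attribute was found. This is a minimal sketch, assuming the `name="value"` form without spaces around `=` (as in the question's string); `find_attribute` is a hypothetical helper name, not a standard function.

```cpp
#include <string>

// Hypothetical helper: look up one attribute by name.
// Returns true and stores the text between the quotes in `out` if found;
// returns false if the attribute is absent or its value is unterminated.
// Assumes the form name="value" with no spaces around '='.
bool find_attribute(const std::string& s, const std::string& name, std::string& out)
{
    // Search for ` name="` with a leading space so that "id" does not
    // match inside a longer attribute name such as "valid".
    const std::string key = " " + name + "=\"";
    const std::string::size_type keyPos = s.find(key);
    if (keyPos == std::string::npos)
        return false;

    const std::string::size_type start = keyPos + key.length();
    const std::string::size_type end = s.find('"', start);
    if (end == std::string::npos)
        return false;

    out = s.substr(start, end - start);
    return true;
}
```

Each lookup searches from the start of the string, so the order of the attributes does not matter. It is still plain substring matching and can be fooled by unusual values, so it is not a substitute for a real XML parser.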

Vlad Feinstein

You could use a regex to extract key-value pairs separated by a '=' with optional space characters in between:

(\S+?)\s*=\s*([^ />]+)

`[^ />]+` captures a value consisting of characters other than space, `/` and `>`. This will capture values with or without quotes; if quotes are present, they are part of the captured value.

Then use std::regex_iterator, a read-only forward iterator, that will call std::regex_search() with the regex. Here's an example:

#include <string>
#include <regex>
#include <iostream>

using namespace std::string_literals;

int main()
{
    std::string mystring = R"(<Information name="Identify" id="IdentifyButton" type="button" id=1/>)"s;
    std::regex reg(R"((\S+?)\s*=\s*([^ />]+))");
    auto start = std::sregex_iterator(mystring.begin(), mystring.end(), reg);
    auto end = std::sregex_iterator{};

    for (std::sregex_iterator it = start; it != end; ++it)
    {
        std::smatch mat = *it;
        auto key = mat[1].str();
        auto value = mat[2].str();
        std::cout << key << "_part=" << value << std::endl;
    }
}

Output:

name_part="Identify"
id_part="IdentifyButton"
type_part="button"
id_part=1

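As written, this requires at least C++14 because of the `s` string-literal suffix; the `<regex>` facilities themselves are available from C++11.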
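
If the quotes should not end up in the values, a variant of the same idea is to require the quotes in the pattern and capture only what is between them. The sketch below is an assumption-laden illustration, not the answer's code: `extract_attributes` is a hypothetical helper, and the pattern `(\w+)\s*=\s*"([^"]*)"` only handles double-quoted values and attribute names made of letters, digits and underscores.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: collect double-quoted attributes into a key -> value map.
// The pattern captures only the text inside the quotes, so values may contain spaces.
std::map<std::string, std::string> extract_attributes(const std::string& s)
{
    // Custom raw-string delimiter "re" because the pattern itself contains )"
    static const std::regex reg(R"re((\w+)\s*=\s*"([^"]*)")re");

    std::map<std::string, std::string> attrs;
    for (std::sregex_iterator it(s.begin(), s.end(), reg), end; it != end; ++it)
    {
        // emplace keeps the first value if a key appears more than once.
        attrs.emplace((*it)[1].str(), (*it)[2].str());
    }
    return attrs;
}
```

Looking up `attrs.find("id")` then tells you whether a given attribute was present at all, which addresses the "may or may not contain either name or id" case from the comments.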

jignatius