I am trying to extract the value of an XML attribute from a std::string that contains XML. I cannot use an XML parser or anything else outside the standard library. Note that I am only looking for this one attribute and not really parsing the XML, so integrating a library or parser just for this extraction does not make sense.
A sample string:
<Params>
<Element Name="elem(1)"/>
<Some Value="10"/>
<Element Name="elem(2)" />
<Attr Value="40" />
</Params>
The strings I need to extract are specifically elem(1) and elem(2).
To match them I use a start and an end delimiter:
the start string is "<Element Name=\"" and the end string is "\"".
I put together this code after scouring many SO posts:
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
    const std::string s = "<Element Name=\"elem(1)\"/> <Some Value=\"10\" Unit=\"m\"/> <Element Name=\"elem(2)\"/> <Attr Value=\"40\" />";
    std::string start = "<Element Name=\"";
    std::string end = "\"";
    // Pattern: start delimiter, anything, end delimiter
    std::regex words_regex(start + "(.*)" + end);
    auto words_begin = std::sregex_iterator(s.begin(), s.end(), words_regex);
    auto words_end = std::sregex_iterator();
    std::cout << "Found "
              << std::distance(words_begin, words_end)
              << " words:\n";
    for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
        std::smatch match = *i;
        std::string match_str = match.str();  // text of the whole match
        std::cout << match_str << '\n';
    }
}
The problem is that the match does not stop at the closing quote of the attribute: it returns everything from the first <Element Name=" up to the last double quote in the string. I will handle collecting the multiple substrings myself, but first I need the regex to return at least the first substring correctly.
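For illustration, here is a minimal sketch of one way the pattern could be tightened. It assumes the attribute value never contains a double quote. The greedy (.*) is swapped for a negated character class, [^"]*, which cannot run past the closing quote, and the value is read from capture group 1 rather than from the whole match. The helper name extract_element_names is hypothetical, not something from the code above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the value of every Name attribute
// that directly follows "<Element " in s.
std::vector<std::string> extract_element_names(const std::string& s)
{
    // [^"]* matches any run of characters that are not a double quote,
    // so the match ends at the first closing quote instead of the last one.
    static const std::regex re("<Element Name=\"([^\"]*)\"");

    std::vector<std::string> names;
    for (std::sregex_iterator it(s.begin(), s.end(), re), last; it != last; ++it) {
        // Sub-match 0 is the whole match; sub-match 1 is the text between the quotes.
        names.push_back((*it)[1].str());
    }
    return names;
}
```

Called on the string s from the code above, this should yield the two values elem(1) and elem(2). A non-greedy (.*?) would probably work here as well, but the negated class states the intent more directly.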
I've seen many mentions of positive look-ahead in regular expressions and am trying to understand it, but I have not been able to get it to work with std::regex yet. Is it fully supported? (I am compiling with Visual Studio 2015 and GCC 4.8.2.)
Other solutions are also welcome, as long as they do not involve third-party libraries and are achievable with standard C++11 code.
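As one non-regex possibility, the same extraction can be sketched with std::string::find alone. This may matter on GCC 4.8.2, because libstdc++ only gained a working <regex> implementation in GCC 4.9, so std::regex code can compile there and still misbehave at run time. The function name extract_between is hypothetical, and the sketch assumes both delimiters are non-empty and appear literally in the input.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns every substring of s that lies between
// an occurrence of start and the next occurrence of end after it.
std::vector<std::string> extract_between(const std::string& s,
                                         const std::string& start,
                                         const std::string& end)
{
    std::vector<std::string> out;
    if (start.empty() || end.empty()) {
        return out;  // assumption: empty delimiters are not meaningful here
    }

    std::string::size_type pos = 0;
    while ((pos = s.find(start, pos)) != std::string::npos) {
        const std::string::size_type first = pos + start.size();  // first char of the value
        const std::string::size_type last = s.find(end, first);   // position of the end delimiter
        if (last == std::string::npos) {
            break;  // start delimiter without a matching end: stop searching
        }
        out.push_back(s.substr(first, last - first));
        pos = last + end.size();  // continue after the end delimiter
    }
    return out;
}
```

With the start and end strings from the code above, extract_between(s, start, end) should return elem(1) and elem(2). It relies only on <string> and <vector>, so it does not depend on the state of <regex> in either compiler.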