
Is it possible to search for certain patterns inside std::string in C++?

I am aware of find(), find_first_of() and so on, but I am looking for something more advanced.

For example, take this XML line: <book category="fantasy">Hitchhiker's guide to the galaxy </book>

If I want to parse it using find() I could do it as follows:

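std::string line = "<book category=\"fantasy\">Hitchhiker's guide to the galaxy </book>";

// Take the opening tag without its angle brackets: the text between '<' and the first '>'.
size_t position = line.find('>');
std::string tmp = line.substr(1, position - 1);
// Split the tag at the first space into node name and parameters.
position = tmp.find(' ');
std::string node_name = tmp.substr(0, position);
std::string parameters = tmp.substr(position + 1);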

...

This feels very tedious and inefficient. It would also fail if I had multiple tags nested inside each other, like this:

<bookstore><book category="fantasy">Hitchhiker's guide to the galaxy </book></bookstore>

Is there a way to search for a pattern, for example <*>, where * stands for any number of arbitrary characters? I would like to capture the value of *, split it into the node name and parameters, then search for </nodename> and get the text in between <nodename> and </nodename>.

isklenar
  • Why not just use a proper XML parser? Rolling your own is a bit silly considering that someone else already did the work for you. – cdhowie Apr 10 '13 at 20:51
  • I agree that for parsing XML you should use an XML parser. For the general case of finding patterns in strings, Regular Expressions might be a good choice (but it is generally a bad idea to use RegEx on XML). – Trying Apr 10 '13 at 20:53
  • Unfortunately, the language you are trying to parse is not a regular one, so regular expressions cannot match tag structures nested to an arbitrary depth. For some simple cases you might be able to find the strings you are looking for using regular expressions. For more complicated ones you should use an XML parser instead. – mikyra Apr 10 '13 at 20:55
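
For the simple, non-nested case the comments mention, a regular expression can pull out the node name, the parameters, and the enclosed text in one step. The following is only a minimal sketch, assuming std::regex (C++11) and std::optional (C++17) are available. The Element struct and the parse_flat_element function are hypothetical names made up for this illustration, and the pattern assumes the element text contains no further tags.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper type for this sketch, not part of any library.
struct Element {
    std::string name;        // node name, e.g. book
    std::string attributes;  // raw parameter text, e.g. category="fantasy"
    std::string content;     // text between the opening and closing tag
};

// Searches for one flat element of the form <name params>text</name>.
// The back-reference \1 forces the closing tag to repeat the opening name.
// Assumption: the text contains no '<', so nested tags are not handled.
std::optional<Element> parse_flat_element(const std::string& line) {
    static const std::regex pattern(R"(<(\w+)\s*([^>]*)>([^<]*)</\1>)");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) {
        return std::nullopt;  // no flat element found
    }
    return Element{match[1].str(), match[2].str(), match[3].str()};
}
```

Called on the book line from the question, this should yield the name book, the parameters category="fantasy", and the title text. On the nested bookstore line it only finds the inner book element and says nothing about the surrounding structure, which is the limitation the comments describe and the reason an XML parser is the safer choice for real documents.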
