I'm trying to write two regular expressions for the following XML string:

<string name="mytag1">mycontent1</string>
<string name="mytag2">mycontent2</string>
<string name="mytag3">mycontent3</string>

First, I need to extract all the tag names (the name attributes), giving:

mytag1
mytag2
mytag3

Second, I need to extract all the contents, giving:

mycontent1
mycontent2
mycontent3

I've tried a lot of regexes with no success. Any ideas? I know it's a bit tricky... Thanks!!

Alan Moore
Billyjoker
  • `([^<>]*)` grab the string you want from index 1 and index 2. Don't parse html files with regex. – Avinash Raj Mar 10 '15 at 11:33
  • thx for your answer but i've just tested on an online tester and it results blank... – Billyjoker Mar 10 '15 at 11:36
  • If your requirement is extracting data from XML file then regex is a bad idea. Use a XML parser for the job. – RaviH Mar 10 '15 at 11:36
  • That's because http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Tobia Mar 10 '15 at 11:37
  • @Billyjoker really https://regex101.com/r/iF7dV0/4 ? – Avinash Raj Mar 10 '15 at 11:38
  • Ohh, it is ok, i will use a parser instead...thx – Billyjoker Mar 10 '15 at 11:38
  • *`I've tried a lot of regex`* show us some of them! – Wolf Mar 10 '15 at 12:34
  • Here's a tutorial to get you started with XML Parsing in Java: http://www.vogella.com/tutorials/JavaXML/article.html You should really always avoid regex for this type of stuff. XML (and HTML) are not Regular Languages, therefore Regular Expressions aren't the best tool for the job. – JNYRanger Mar 10 '15 at 13:15

1 Answer

If the structure of the string tags in your XML is really that flat and simple[1], you can use a regex:

<string name="(mytag\d+)">([^<]*)</string>

The parentheses capture:

  • \1 the name attribute
  • \2 the content (provided that there is nothing nested in it)
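
Since the comments mention Java, here is a minimal sketch of how that pattern could be applied with `java.util.regex`. The class name `StringTagRegex` and the helper `extract` are made up for illustration, and it assumes the input is exactly as flat as in the question:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class StringTagRegex {
    // Same pattern as above: group 1 = name attribute, group 2 = text content.
    private static final Pattern STRING_TAG =
            Pattern.compile("<string name=\"(mytag\\d+)\">([^<]*)</string>");

    /** Collects the text of the given capture group for every match in xml. */
    static List<String> extract(String xml, int group) {
        List<String> result = new ArrayList<>();
        Matcher m = STRING_TAG.matcher(xml);
        while (m.find()) {
            result.add(m.group(group));
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<string name=\"mytag1\">mycontent1</string>\n"
                + "<string name=\"mytag2\">mycontent2</string>\n"
                + "<string name=\"mytag3\">mycontent3</string>";
        System.out.println(extract(xml, 1)); // [mytag1, mytag2, mytag3]
        System.out.println(extract(xml, 2)); // [mycontent1, mycontent2, mycontent3]
    }
}
```

One pattern covers both needs: pick group 1 for the names and group 2 for the contents.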

[1] Normally, XML exceeds the power of regex. Regular expressions are good for fast lexical analysis, but they are hopelessly overwhelmed by recursive (nested) languages.
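
For anything less flat, the parser route suggested in the comments is the safer choice. A rough sketch using the JDK's built-in DOM parser follows. It assumes the `string` elements sit inside a single root element (called `resources` here purely as a placeholder, since the snippet in the question shows no root); `StringTagParser` and `parse` are illustrative names:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
class StringTagParser {
    /** Maps each string element's name attribute to its text content, in document order. */
    static Map<String, String> parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("string");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                result.put(e.getAttribute("name"), e.getTextContent());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            throw new IllegalArgumentException("Could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        // The <resources> root is an assumption; well-formed XML needs exactly one root.
        String xml = "<resources>"
                + "<string name=\"mytag1\">mycontent1</string>"
                + "<string name=\"mytag2\">mycontent2</string>"
                + "<string name=\"mytag3\">mycontent3</string>"
                + "</resources>";
        Map<String, String> parsed = parse(xml);
        System.out.println(parsed.keySet()); // [mytag1, mytag2, mytag3]
        System.out.println(parsed.values()); // [mycontent1, mycontent2, mycontent3]
    }
}
```

Unlike the regex, this keeps working if attributes are reordered, whitespace changes, or the content contains escaped characters such as `&lt;`.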

Wolf