3

How would I "find" and "get" a value between two strings?

e.g.: <a>3</a>

I'm reading a file and want to find where <a> starts, then stop reading when I reach </a>. The value I want to return is "3".

Using JRE 6

Jeremy
Mario
  • 1
    A regular expression will work, for some value of "work". However, HTML/XML parsing (the two are related but different) should really be done with an appropriate tool (ideally one that supports a powerful selector language). –  Aug 20 '11 at 02:03
  • exact duplicate of [Searching for a tag, then saving text between tag as a variable](http://stackoverflow.com/questions/7093716/searching-for-a-tag-then-saving-text-between-tag-as-a-variable) – Ernest Friedman-Hill Aug 20 '11 at 02:04

3 Answers

12

Your two main options are:

1) preferred but potentially complicated: using an XML/HTML parser and getting the text within the first "a" element. e.g. using Jsoup (thanks @alpha123):

Jsoup.parse("<a>3</a>").select("a").first().text(); // => "3"

2) easier but not very reliable: using a regular expression to extract the characters between the <a> and </a> strings. e.g.:

String s = "<a>3</a>";
Pattern p = Pattern.compile("<a>(.*?)</a>");
Matcher m = p.matcher(s);
if (m.find()) {
  System.out.println(m.group(1)); // => "3"
}
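
As a rough sketch of how option 2 might be generalized, the version below collects every value between two arbitrary delimiter strings. The class name `BetweenStrings` and the method `allBetween` are hypothetical names chosen for illustration. It assumes the delimiters should be matched literally (hence `Pattern.quote`) and that a value may span several lines (hence `Pattern.DOTALL`). It avoids the diamond operator so that it should still compile on Java 6.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BetweenStrings {
    // Hypothetical helper: returns every substring found between `open` and `close`, in order.
    static List<String> allBetween(String text, String open, String close) {
        // Pattern.quote treats the delimiters as literal text, not as regex syntax.
        // DOTALL lets '.' match line terminators, so multi-line values are found too.
        Pattern p = Pattern.compile(
                Pattern.quote(open) + "(.*?)" + Pattern.quote(close),
                Pattern.DOTALL);
        Matcher m = p.matcher(text);
        List<String> values = new ArrayList<String>();
        while (m.find()) {
            values.add(m.group(1)); // group 1 is the text between the delimiters
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(allBetween("<a>3</a> text <a>4</a>", "<a>", "</a>")); // prints [3, 4]
    }
}
```

The reluctant quantifier `.*?` stops at the first closing delimiter, so each pair is matched separately. Like any regex approach, this still breaks on nested tags or tags with attributes.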
maerics
6

Jsoup will do this easily.

String title = Jsoup.parse("<a>3</a>").select("a").first().text();
Peter C
2

You can use regex:

try {
    Pattern regex = Pattern.compile("<a>(.*?)</a>"); // reluctant quantifier, so each <a>...</a> pair matches separately
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        for (int i = 1; i <= regexMatcher.groupCount(); i++) {
            // matched text: regexMatcher.group(i)
            // match start: regexMatcher.start(i)
            // match end: regexMatcher.end(i)
        }
    } 
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}

But, if your input is HTML, you should really consider using an HTML parser.
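
If the input happens to be well-formed XML rather than real-world HTML, the DOM parser that ships with the JDK may be enough and needs no extra library. The sketch below is only an illustration under that assumption; `FirstElementText` and `firstText` are hypothetical names, and messy HTML would still call for a dedicated HTML parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FirstElementText {
    // Hypothetical helper: returns the text of the first <tag> element, or null if none exists.
    static String firstText(String xml, String tag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
        } catch (Exception e) {
            // Assumption: input that is not well-formed XML is reported as a bad argument.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstText("<root><a>3</a></root>", "a")); // prints 3
    }
}
```

Unlike the regex, this handles attributes on the tag and nested markup inside it, at the cost of rejecting input that is not well-formed.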

Nithin Philips