As I understand it, you are searching for the letters ID, followed by the character >, followed by exactly five digits, and finally followed by the characters </.
You can achieve this with the following regular expression:
ID>\d{5}</
where ID> is a literal string, \d means a single digit, and {5} means the preceding expression repeated five times. Since the preceding expression is \d, \d{5} means five digits. Finally, </ is also a literal string.
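To see how the pattern above behaves, here is a minimal sketch. The class name IdPatternDemo and the helper wholeMatch are made up for illustration. It shows that only exactly five digits match, and that the matched text still includes the literal ID> and </ around the digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the class and method names are illustrative only.
class IdPatternDemo {
    // Same pattern as above. The backslash is doubled because this is a Java string literal.
    private static final Pattern ID_PATTERN = Pattern.compile("ID>\\d{5}</");

    // Returns the full matched text, or null when the pattern is not found.
    static String wholeMatch(String s) {
        Matcher m = ID_PATTERN.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("<ID>12345</ID>"));   // five digits: match
        System.out.println(wholeMatch("<ID>1234</ID>"));    // four digits: no match
        System.out.println(wholeMatch("<ID>123456</ID>"));  // six digits: no match
    }
}
```

With six digits the match fails because, after consuming five digits, the pattern requires < but finds another digit.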
Since you want to extract only the digits, you should put them in a capturing group by enclosing \d{5} in parentheses. Hence the regular expression you require is:
ID>(\d{5})</
Here is the Java code. Note that since the character \ is the escape character in Java string literals, you need to write it twice in the regular expression.
public class MyClass {
    public static void main(String[] args) {
        // Tests: each call prints 12345
        System.out.println(getId("<StudentID>12345</StudentID>"));
        System.out.println(getId("<ID>12345</ID>"));
        System.out.println(getId("<Somedata>SSS<Somedata><StudentID>12345</StudentID><Name>MMM</Name>"));
        System.out.println(getId("<Somedata>SSS<Somedata><ID>12345</ID><Name>MMM</Name>"));
    }

    // Returns the five digits found between "ID>" and "</", or an empty string if there is no match.
    static String getId(String s) {
        java.util.regex.Pattern pattern = java.util.regex.Pattern.compile("ID>(\\d{5})</");
        java.util.regex.Matcher matcher = pattern.matcher(s);
        String id = "";
        if (matcher.find()) {
            // Group 1 is the part of the match enclosed in parentheses.
            id = matcher.group(1);
        }
        return id;
    }
}
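If the input might contain more than one ID (an assumption, since the question only shows one per string), a variant along the following lines could collect all of them. The names IdExtractor and getAllIds are hypothetical. The sketch also compiles the pattern once instead of on every call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant; class and method names are illustrative only.
class IdExtractor {
    // Compiled once and reused across calls.
    private static final Pattern ID_PATTERN = Pattern.compile("ID>(\\d{5})</");

    // Collects group 1 (the five digits) of every match, in order of appearance.
    static List<String> getAllIds(String s) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = ID_PATTERN.matcher(s);
        while (matcher.find()) {
            ids.add(matcher.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(getAllIds("<ID>12345</ID><StudentID>67890</StudentID>"));
        System.out.println(getAllIds("<Name>MMM</Name>"));
    }
}
```

Each call to find() resumes searching after the previous match, so the loop visits every non-overlapping occurrence and returns an empty list when there is none.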
Refer to the following:
Java tutorial on regular expressions
The website Regular-Expressions.info
You can also experiment with regular expressions online at regex101.