I have an large String which contains some XML. This XML contains input like:
<xyz1>...</xyz1>
<hello>text between strings #1</hello>
<xyz2>...</xyz2>
<hello>text between strings #2</hello>
<xyz3>...</xyz3>
I want to get all these <hello>text between strings</hello>
.
So in the end I want to have a List or any Collection which contains all <hello>...</hello>
I tried it with Regex and Matcher but the problem is it doesn't work with large strings.... if I try it with smaller Strings, it works. I read a blogpost about this and this says the Java Regex Broken for Alternation over Large Strings.
Is there any easy and good way to do this?
Edit:
An attempt is...
String pattern1 = "<hello>";
String pattern2 = "</hello>";
List<String> helloList = new ArrayList<String>();
String regexString = Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2);
Pattern pattern = Pattern.compile(regexString);
Matcher matcher = pattern.matcher(scannerString);
while (matcher.find()) {
String textInBetween = matcher.group(1); // Since (.*?) is capturing group 1
// You can insert match into a List/Collection here
helloList.add(textInBetween);
logger.info("-------------->>>> " + textInBetween);
}