I have a very long HTML string that contains multiple blocks of the form
<dl id="divmap"> .... </dl>
I want to remove all the content between these tags.
I wrote this code in Java:
String triphtml = htmlString;
System.out.println("triphtml is " + triphtml);
System.out.println("test1 ");
final Pattern pattern = Pattern.compile("(<dl id=\"" + selectedArray[i] + "\">)(.+?)(</dl>)",
        Pattern.DOTALL);
final Matcher matcher = pattern.matcher(triphtml);
// matcher.find();
System.out.println("pattern of test1 is : "
        + pattern); // Prints
System.out.println("MATCHER of test1 is : "
        + matcher); // Prints
System.out.println("MATCH COUNT of test1 a: "
        + matcher.groupCount()); // Prints
System.out.println("MATCH COUNT of test1 a: "
        + matcher.find()); // Prints
while (matcher.find()) {
    // System.out.println("MATCH GP 3: "+matcher.group(3).substring(1,10));
    for (int z = 0; z <= matcher.groupCount(); z++) {
        String extstr = matcher.group(z);
        System.out.println("matcher group of " + z + " test1 is " + extstr);
        System.out.println("ext a of test1 is " + extstr);
        triphtml = triphtml.replaceAll(extstr, "");
        System.out.println("Group found of test1 is :\n" + extstr);
    }
}
But this code removes only some of the dl blocks, while others remain in triphtml. I don't know why this is happening. Here triphtml is an HTML string that contains multiple dl elements. Please help me remove the content of every
<dl id="divmap"> block.
Thanks in advance.
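
Two things in the code above probably explain the missed blocks. The matcher.find() call inside the last println already advances the matcher, so the while loop starts at the second match. Also, String.replaceAll treats its first argument as a regular expression, so matched HTML containing characters such as ( or ? may not match itself. A minimal sketch of the intended behaviour, assuming the blocks are not nested and the opening tag is written exactly as <dl id="...">. The class and method names are made up for illustration, and since "remove the content" can be read two ways, both readings are shown:

```java
import java.util.regex.Pattern;

class DlRemover {

    // Removes every <dl id="ID">...</dl> block, tags included.
    // Assumption: blocks are not nested and the opening tag has no other attributes.
    static String removeDlBlocks(String html, String id) {
        // Pattern.quote keeps regex metacharacters in the id from being interpreted.
        Pattern pattern = Pattern.compile(
                "<dl id=\"" + Pattern.quote(id) + "\">.*?</dl>", Pattern.DOTALL);
        // Matcher.replaceAll handles every match in one pass, so no manual find() loop is needed.
        return pattern.matcher(html).replaceAll("");
    }

    // Keeps the <dl id="ID"> and </dl> tags but drops whatever is between them.
    static String emptyDlBlocks(String html, String id) {
        Pattern pattern = Pattern.compile(
                "(<dl id=\"" + Pattern.quote(id) + "\">).*?(</dl>)", Pattern.DOTALL);
        // $1 and $2 put the captured opening and closing tags back.
        return pattern.matcher(html).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not taken from the real htmlString.
        String html = "<p>a</p><dl id=\"divmap\"><dt>x (1)</dt>\n</dl>"
                + "<p>b</p><dl id=\"divmap\">y</dl>";
        System.out.println(removeDlBlocks(html, "divmap"));
        System.out.println(emptyDlBlocks(html, "divmap"));
    }
}
```

If the dl blocks can be nested, or the attributes vary, a regular expression is unlikely to be reliable and an HTML parser would be the safer choice.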