I have the HTML below, which needs to be parsed recursively. I am using the Jericho HTML Parser library for this, but I have not been able to get the recursion working. Any pointers are appreciated!
HTML
<div wicket:id="Container1">
<div wicket:id="Panel1"></div>
<form wicket:id="sampleForm" action="">
<div id="InterstitialPanel" class="usaa-interstitial s1"></div>
<div wicket:id="DisclosureSection">
<div class="sample class">
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
<li>Item 4</li>
</ul>
</div>
</div>
</form>
</div>
<div wicket:id="Container2">
<h2>This is container 2</h2>
</div>
Expected output
Here, I am fetching the value of the wicket:id attribute from the original HTML and creating a new tag from it. If an element is a standard HTML tag (without wicket attributes), it is kept as is.
<Container1>
<Panel1></Panel1>
<Form id="sampleForm">
<InterstitialPanel></InterstitialPanel>
<DisclosureSection>
<div class="sample class">
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
<li>Item 4</li>
</ul>
</div>
</DisclosureSection>
</Form>
</Container1>
<Container2>
<h2>This is container 2</h2>
</Container2>
Java Code
```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import net.htmlparser.jericho.Element;
import net.htmlparser.jericho.Source;

public class Main {

    // ReactPageContext is my own class and is not shown here.
    public static void main(String[] args) throws IOException {
        Main m = new Main();
        File file = new File("C:\\USAAPoC\\ClientSamplePage.html");
        // Source can be constructed from a URL, so the File is converted first.
        Source doc = new Source(file.toURI().toURL());
        // getFirstElement returns the first element with the given name, or null.
        Element element = doc.getFirstElement("body");
        m.processBody(element, new ReactPageContext());
    }

    private StringBuilder processBody(Element element, ReactPageContext reactPageContext) {
        StringBuilder sb = new StringBuilder();
        if (!element.getChildElements().isEmpty()) {
            element.getChildElements().forEach(child -> sb.append(processTag(child, reactPageContext)));
        } else {
            sb.append("<Container>\n");
        }
        return sb;
    }

    private StringBuilder processTag(Element element, ReactPageContext reactPageContext) {
        StringBuilder sb = new StringBuilder();
        // getAllElements() returns the elements at every depth as one flat list,
        // so the nesting is not preserved in this loop.
        element.getAllElements().forEach(child -> {
            if (child.getName().equals("div") && child.getAttributes().get("wicket:id") != null) {
                sb.append("<Container>");
            } else if (child.getName().equals("form")) {
                sb.append("<Form>");
                // Assumed intent: the elements nested inside this form.
                List<Element> allChildFormElements = child.getAllElements();
                allChildFormElements.forEach(childFormElement -> {
                    if (childFormElement.getName().equals("div")
                            && childFormElement.getAttributes().get("wicket:id") != null) {
                        sb.append("<Container>");
                    } else {
                        // logic
                    }
                });
            }
        });
        return sb;
    }
}
```
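For reference, here is a minimal sketch of the recursive shape the expected output seems to call for. It is not Jericho code: to keep it runnable without extra dependencies, it uses the JDK's built-in DOM parser as a stand-in, which only works because the sample markup is well-formed. The class name `WicketTagRewriter` and its method names are hypothetical. It also assumes that only elements carrying `wicket:id` are renamed, that a `form` becomes `<Form id="...">`, and that everything else (including the `div` with a plain `id`) is copied unchanged. Text is copied without re-escaping, which is a simplification.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WicketTagRewriter {

    /** Parses a well-formed fragment (wrapped in a dummy root) and rewrites it. */
    public static String convert(String fragment) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<root>" + fragment + "</root>")))
                .getDocumentElement();
        StringBuilder sb = new StringBuilder();
        writeChildren(root, sb);
        return sb.toString();
    }

    /** Visits only the direct children; each child element handles its own subtree. */
    private static void writeChildren(Node parent, StringBuilder sb) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                writeElement((Element) child, sb);
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getTextContent());
            }
        }
    }

    /** Writes the opening tag, recurses into the children, then writes the closing tag. */
    private static void writeElement(Element el, StringBuilder sb) {
        String wicketId = el.getAttribute("wicket:id"); // empty string when absent
        String name;
        String attrs;
        if (wicketId.isEmpty()) {
            name = el.getTagName();            // standard tag: keep as is
            attrs = copyAttributes(el);
        } else if (el.getTagName().equals("form")) {
            name = "Form";                     // assumed rule for forms
            attrs = " id=\"" + wicketId + "\"";
        } else {
            name = wicketId;                   // wicket:id value becomes the tag name
            attrs = "";
        }
        sb.append('<').append(name).append(attrs).append('>');
        writeChildren(el, sb);                 // the recursive step
        sb.append("</").append(name).append('>');
    }

    /** Copies the attributes of an unchanged element; order follows the parser's attribute map. */
    private static String copyAttributes(Element el) {
        StringBuilder sb = new StringBuilder();
        NamedNodeMap map = el.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Node a = map.item(i);
            sb.append(' ').append(a.getNodeName()).append("=\"").append(a.getNodeValue()).append('"');
        }
        return sb.toString();
    }
}
```

The point of the sketch is that each call handles one element and descends only into its direct children, so every opening tag gets a matching closing tag at the right depth. With Jericho, the same shape would presumably recurse over `getChildElements()` rather than `getAllElements()`, with the text between child elements handled separately.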