_ Hi , this is my web page :
<html>
<head>
</head>
<body>
<div> text div 1</div>
<div>
<span>text of first span </span>
<span>text of second span </span>
</div>
<div> text div 3 </div>
</body>
</html>
I'm using jsoup to parse it , and then browse all elements inside the page and get their paths :
Document doc = Jsoup.parse(new File("C:\\Users\\HC\\Desktop\\dataset\\index.html"), "UTF-8");
Elements elements = doc.body().select("*");
ArrayList all = new ArrayList();
for (Element element : elements) {
if (!element.ownText().isEmpty()) {
StringBuilder path = new StringBuilder(element.nodeName());
String value = element.ownText();
Elements p_el = element.parents();
for (Element el : p_el) {
path.insert(0, el.nodeName() + '/');
}
all.add(path + " = " + value + "\n");
System.out.println(path +" = "+ value);
}
}
return all;
my code give me this result :
html/body/div = text div 1
html/body/div/span = text of first span
html/body/div/span = text of second span
html/body/div = text div 3
in fact i want get result like this :
html/body/div[1] = text div 1
html/body/div[2]/span[1] = text of first span
html/body/div[2]/span[2] = text of second span
html/body/div[3] = text div 3
please could any one give me idea how to get reach this result :) . thanks in advance.