The <br> tag is an empty tag, which means that it has no end tag.
See: http://www.w3schools.com/tags/tag_br.asp
Once you replace your </br> tag with <br> (if you print the jsoup Document you will see that jsoup fixes such mistakes automatically), your <td> tag has four child nodes: a text node, a <br> element, a second text node, and a second <br> element.
So the text SCH4UE-01 : Chemistry is the first child node (element.childNode(0)).
Code
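import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String htmlString = "<html><body><table><td> SCH4UE-01 : Chemistry <br> Block: 1 - rm. 315 <br></td></table></body></html>";
Document doc = Jsoup.parse(htmlString);
Elements tdElements = doc.select("td");
for (Element tdElement : tdElements) {
    // childNode(0) is the first child of the <td>, here the leading text node
    System.out.println(tdElement.childNode(0));
}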
Output
SCH4UE-01 : Chemistry
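
If you also need the second line of text (Block: 1 - rm. 315), one option is to walk all child nodes of the <td> and keep only the text nodes. The following is a minimal sketch, not the only way to do it: the class name TdChildNodes and the helper directTexts are made up for illustration, and it assumes the text you want sits directly inside the <td> rather than in nested elements.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class TdChildNodes {

    // Hypothetical helper: returns the trimmed text of every non-blank
    // text node that is a direct child of the first <td> in the document.
    static List<String> directTexts(String html) {
        Document doc = Jsoup.parse(html);
        Element td = doc.select("td").first();
        List<String> texts = new ArrayList<>();
        if (td == null) {
            return texts; // no <td> found
        }
        for (Node child : td.childNodes()) {
            // <br> children are Element nodes and are skipped here
            if (child instanceof TextNode) {
                String text = ((TextNode) child).text().trim();
                if (!text.isEmpty()) {
                    texts.add(text);
                }
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        String html = "<html><body><table><td> SCH4UE-01 : Chemistry <br>"
                + " Block: 1 - rm. 315 <br></td></table></body></html>";
        for (String text : directTexts(html)) {
            System.out.println(text);
        }
    }
}
```

Checking instanceof TextNode is what separates the two pieces of text from the <br> elements between them, so the same loop keeps working if the cell contains more or fewer line breaks.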