I came up with the code below, but it doesn't work. I am trying to extract only the text that contains the keyword, not the entire text of the webpage just because the page contains that keyword somewhere.
String pconcat = "";
for (i = 0; i < urls.length; i++) {
    Document doc = Jsoup.connect(urls[i]).ignoreContentType(true).timeout(60 * 1000).get();
    for (int x = 0; x < keyWords.length; x++) {
        if (doc.body().text().toLowerCase().contains(keyWords[x].toLowerCase())) {
            Elements e = doc.select("body:contains(" + keyWords[x] + ")");
            for (Element element : e) {
                pconcat += element.text();
                System.out.println("pconcat" + pconcat);
            }
        }
    }
}
For example, on example.com, if the keyword I am looking for is "documents", I need the output to be "This domain is established to be used for illustrative examples in documents." and nothing else.
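To make the expected result concrete, here is a minimal sketch of the filtering step I have in mind, using only the standard library. It assumes the text has already been pulled out of a single element (for instance via `element.text()`); the class name `KeywordSentences`, the method `sentencesContaining`, and the hard-coded sample text are made up for illustration. It splits the text into sentences with `java.text.BreakIterator` and keeps only those containing the keyword, ignoring case.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of Jsoup.
class KeywordSentences {

    // Returns every sentence of `text` that contains `keyword`, ignoring case.
    static List<String> sentencesContaining(String text, String keyword) {
        List<String> matches = new ArrayList<>();
        String needle = keyword.toLowerCase(Locale.ROOT);

        // Sentence boundaries follow the rules for English text.
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);

        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (sentence.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(sentence);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-in for the text of one paragraph element; hard-coded sample (assumption).
        String paragraphText = "This domain is established to be used for illustrative "
                + "examples in documents. You may use this domain in examples without "
                + "prior coordination or asking for permission.";

        for (String sentence : sentencesContaining(paragraphText, "documents")) {
            System.out.println(sentence);
        }
    }
}
```

With the sample paragraph above, only the first sentence is printed. If the same method were fed the text of the whole body instead of one element, headings without a trailing period would probably end up fused onto the sentence that follows them, so it seems better to apply it element by element.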