I would like to read the content of a website into a string.
I started by using jsoup
as follows:
private void getWebsite() {
new Thread(new Runnable() {
@Override
public void run() {
final StringBuilder builder = new StringBuilder();
try {
String query = "https://merhav.nli.org.il/primo-explore/search?tab=default_tab&search_scope=Local&vid=NLI&lang=iw_IL&query=any,contains,הארי פוטר";
Document doc = Jsoup.connect(query).get();
String title = doc.title();
Elements links = doc.select("div");
builder.append(title).append("\n");
for (Element link : links) {
builder.append("\n").append("Link : ").append(link.attr("href"))
.append("\n").append("Text : ").append(link.text());
}
} catch (IOException e) {
builder.append("Error : ").append(e.getMessage()).append("\n");
}
runOnUiThread(new Runnable() {
@Override
public void run() {
tv_result.setText(builder.toString());
}
});
}
}).start();
}
However, the problem is that in this site, when I web browser such as chrome it says in one of it lines:
window.appPerformance.timeStamps['index.html']= Date.now();</script><primo-explore><noscript>JavaScript must be enabled to use the system</noscript><style>.init-message {
So I read that jsoup
doesn't have a good solution for this case.
Is there any good way to get the element of this page even though that it uses javascript?
EDIT:
After trying the suggestions below, I used webView to load the url and then parsed it using jsoap as follows:
wb_result.getSettings().setJavaScriptEnabled(true);
MyJavaScriptInterface jInterface = new MyJavaScriptInterface();
wb_result.addJavascriptInterface(jInterface, "HtmlViewer");
wb_result.setWebViewClient(new WebViewClient() {
@Override
public void onPageFinished(WebView view, String url) {
wb_result.loadUrl("javascript:window.HtmlViewer.showHTML ('<head>'+document.getElementsByTagName('html')[0].innerHTML+'</head>');");
}
});
It did the job and indeed showed me the element. However, still, unlike a browser, it shows some lines as a function and not as a result. For example:
ng-href="{{::$ctrl.getDeepLinkPath()}}"
Is there a way to parse and display the result like in the browser?
Thank you