I'm using HtmlUnit to fetch an HTML page for parsing. The page URL is as follows: http://detail.tmall.com/item.htm?id=16158055933&ad_id=&am_id=&cm_id=140105335569ed55e27b&pm_id=
But I still can't get the price, 28.00. I've tried these methods:
Thread.sleep();
webClient.waitForBackgroundJavaScript();
I've also tried to execute the relevant JavaScript manually, but I couldn't find which script is responsible for displaying the price.
My code is as follows:
webClient = new WebClient();
// webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setActiveXNative(true);
webClient.getOptions().setAppletEnabled(false);
webClient.getOptions().setCssEnabled(true);
webClient.getOptions().setDoNotTrackEnabled(true);
webClient.getOptions().setGeolocationEnabled(false);
webClient.getOptions().setPopupBlockerEnabled(false);
webClient.getOptions().setPrintContentOnFailingStatusCode(true);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(true);
webClient.getOptions().setThrowExceptionOnScriptError(true);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
HtmlPage page = null;
try {
    page = webClient.getPage(url);
} catch (FailingHttpStatusCodeException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (MalformedURLException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
// content = page.asXml();
content = page.getWebResponse().getContentAsString();
webClient.closeAllWindows();
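
For reference, the sleep-based waiting mentioned above can be written as a polling loop that re-reads a snapshot until the expected text shows up or a timeout expires. This is only a minimal sketch using the standard library: the class `PollUntil`, the method `pollUntil`, and the fake page supplier are hypothetical names for illustration, not part of HtmlUnit. In real use the supplier would presumably wrap something like `page.asXml()`, which is an assumption on my part.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper, not part of HtmlUnit: polls a value source with Thread.sleep between reads.
final class PollUntil {
    private PollUntil() {}

    /**
     * Re-reads a value until it satisfies the predicate or the timeout elapses.
     * Returns Optional.empty() on timeout or if the waiting thread is interrupted.
     */
    static <T> Optional<T> pollUntil(Supplier<T> source, Predicate<T> ready,
                                     long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            T value = source.get();
            if (value != null && ready.test(value)) {
                return Optional.of(value);
            }
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty();
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a page snapshot: the price only "appears" on the third read.
        int[] reads = {0};
        Supplier<String> fakePage = () -> ++reads[0] < 3
                ? "<span class=\"price\"></span>"
                : "<span class=\"price\">28.00</span>";

        Optional<String> html = pollUntil(fakePage, s -> s.contains("28.00"), 2_000, 50);
        System.out.println(html.isPresent() ? "price found" : "timed out");
    }
}
```

Whether this helps depends on the price actually being written into the DOM by background JavaScript; if the snapshot never changes, the loop simply times out.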