
I'm using HtmlUnit to fetch an HTML page for parsing. The page URL is: http://detail.tmall.com/item.htm?id=16158055933&ad_id=&am_id=&cm_id=140105335569ed55e27b&pm_id=

But I can never get the price, 28.00, from the page. I've tried these approaches:

  1. Thread.sleep();

  2. webClient.waitForBackgroundJavaScript();
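
To make the waiting idea concrete, here is a minimal sketch of polling with a deadline instead of a single fixed sleep. It uses only the Java standard library; `PollingWait` and `waitFor` are hypothetical names, not part of HtmlUnit, and the `reader` stands in for whatever re-reads the page content.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not an HtmlUnit class.
final class PollingWait {
    private PollingWait() {}

    /**
     * Re-reads a value until it satisfies the condition or the timeout elapses.
     * Returns the first accepted value, or an empty Optional on timeout or interrupt.
     */
    static <T> Optional<T> waitFor(Supplier<T> reader, Predicate<T> ready,
                                   long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            T value = reader.get();
            if (value != null && ready.test(value)) {
                return Optional.of(value);
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty(); // gave up: condition never became true
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Optional.empty();
            }
        }
    }
}
```

With HtmlUnit, the reader could be something like `() -> page.asXml()` and the condition a check for the price text. That wiring is an assumption on my part, and it only helps if the script that fills in the price actually runs inside HtmlUnit.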

I've also tried to execute the relevant JavaScript manually, but I couldn't find which script is responsible for displaying the price.

My code is as follows:

// Imports for HtmlUnit 2.x (com.gargoylesoftware packages)
import java.io.IOException;
import java.net.MalformedURLException;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

// webClient, url and content are fields declared elsewhere
webClient = new WebClient();
// webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setActiveXNative(true);
webClient.getOptions().setAppletEnabled(false);
webClient.getOptions().setCssEnabled(true);
webClient.getOptions().setDoNotTrackEnabled(true);
webClient.getOptions().setGeolocationEnabled(false);
webClient.getOptions().setPopupBlockerEnabled(false);
webClient.getOptions().setPrintContentOnFailingStatusCode(true);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(true);
webClient.getOptions().setThrowExceptionOnScriptError(true);
// Re-synchronize AJAX calls so they run in the calling thread
webClient.setAjaxController(new NicelyResynchronizingAjaxController());

HtmlPage page = null;
try {
    page = webClient.getPage(url);
} catch (FailingHttpStatusCodeException e) {
    e.printStackTrace();
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
// content = page.asXml();

// page stays null if getPage threw, so guard before reading the response
if (page != null) {
    content = page.getWebResponse().getContentAsString();
}
webClient.closeAllWindows();
Kara
