How can I use HtmlUnit to get data from a web page generated by a Java servlet? I keep getting an error when I try to read the page. /getSurvey is the servlet that creates the page, but how can I access the HTML it generates?

final WebClient webClient = new WebClient();
final HtmlPage page = webClient.getPage("http://survey-creator.appspot.com/getSurvey");

Delanoy
  • And the error you get is ... ? – matt b Mar 25 '11 at 18:18
  • The fact that the page is generated by the servlet is meaningless for HtmlUnit. It's a programmatic web browser accessing web resources like any other browser. – JB Nizet Mar 25 '11 at 18:23

1 Answer

HtmlUnit is not really "just" an HTML parser. It's more of a programmatic web browser, intended for surfing through web pages and/or filling out web forms programmatically from Java. If your sole purpose is to get the HTML as a String, use a real HTML parser. I can recommend Jsoup for this.

String html = Jsoup.connect("http://stackoverflow.com").get().html();

That's it. Jsoup can, however, do much more than that, such as selecting elements of interest with CSS-like selectors.
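
As a rough illustration of selecting elements with Jsoup, here is a minimal sketch. The markup, the `label.question` selector, and the `SurveyScraper` class are all hypothetical, since the real output of /getSurvey isn't shown; the sample HTML is parsed from a string so the example runs without network access.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SurveyScraper {

    // Hypothetical markup standing in for whatever /getSurvey actually returns.
    static final String SAMPLE_HTML =
        "<html><body><form id=\"survey\">"
        + "<label class=\"question\">Favourite colour?</label>"
        + "<label class=\"question\">Favourite food?</label>"
        + "</form></body></html>";

    // Parses the HTML and returns the text of every <label class="question">.
    static List<String> questionTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element label : doc.select("label.question")) {
            texts.add(label.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Against the live servlet you would fetch the document instead, e.g.
        // Jsoup.connect("http://survey-creator.appspot.com/getSurvey").get()
        // and run the same select() call on the result.
        System.out.println(questionTexts(SAMPLE_HTML));
    }
}
```

You would need to adapt the selector to the structure your servlet really produces; viewing the page source in a browser is the quickest way to find out what that is.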

BalusC