I want to download the HTML of a web page that contains some JavaScript. If I use a library like jsoup, I get the HTML without the elements generated by the JavaScript.

How can I get the HTML as it is after the JavaScript has executed?

Edit: How can I use the script from the answer in a Java program?

Ortomala Lokni
Frank Cunningham
    If you want to *execute* (not just read) the JavaScript on a web site, you need to run [a headless Web browser](https://gist.github.com/evandrix/3694955) instead of using an HTML parser. – Jacob Budin Jan 03 '15 at 16:10

1 Answer

You can use PhantomJS with the following script:

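// Create a PhantomJS page object (a headless browser tab).
var page = require('webpage').create();

page.open('http://stackoverflow.com', function (status) {
  if (status !== 'success') {
    console.log('Open failed');
  } else {
    // Serialize the DOM as it stands after the page's JavaScript has run.
    var html = page.evaluate(function () {
      return document.documentElement.outerHTML;
    });
    console.log(html);
  }
  // Always exit, otherwise the PhantomJS process keeps running.
  phantom.exit();
});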

If you want to execute this script from Java, see the following:

Running Phantomjs from javascript, JSP or Java
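
As a rough sketch of what calling the script from Java could look like, the snippet below starts PhantomJS as an external process with `ProcessBuilder` and reads what it prints. It assumes that the `phantomjs` executable is on the `PATH` and that the script above was saved as `dump.js`. The file name, the class name `RenderedHtmlFetcher` and the helper `runAndCapture` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RenderedHtmlFetcher {

    // Runs an external command and returns its output (stdout and stderr merged).
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so a single reader drains everything the process writes.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        String output;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            output = reader.lines().collect(Collectors.joining("\n"));
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with code " + exitCode + ": " + output);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: "phantomjs" is on the PATH and "dump.js" contains the script above.
        String html = runAndCapture(Arrays.asList("phantomjs", "dump.js"));
        System.out.println(html);
    }
}
```

Note that the script as written calls `phantom.exit()` without an error code, so a failed page load still exits with 0. In that case the returned string would be `Open failed` rather than HTML, and the caller would have to check for it.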

Community
Ortomala Lokni