
Using Java, I need to save a complete webpage (with all its contents: images, CSS, JavaScript, etc.), just like a browser's "Save as → Webpage, complete" option, using the HttpClient library. How can I do this?

  • Related: [Fetch complete web page using java code](https://stackoverflow.com/q/10119998), [How can I download web page with dependencies in Java?](https://stackoverflow.com/q/4318616), [download a complete web page including resources (like images) in java](https://stackoverflow.com/q/4359060), [download webpage and dependencies, including css images](https://stackoverflow.com/q/1581551). – dbc Apr 25 '21 at 13:38

3 Answers


You can try the libcurl Java binding: http://curl.haxx.se/libcurl/java/

You can also refer to this discussion: curl-equivalent-in-java
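
Since the question mentions HttpClient specifically, here is a minimal sketch of the curl-equivalent fetch in Java, assuming Apache HttpClient 4.x; the URL is a placeholder:

    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.util.EntityUtils;

    public class PageFetcher {
        public static void main(String[] args) throws Exception {
            // Fetch a single page, much like "curl http://example.com/"
            try (CloseableHttpClient client = HttpClients.createDefault();
                 CloseableHttpResponse response =
                         client.execute(new HttpGet("http://example.com/"))) { // placeholder URL
                String html = EntityUtils.toString(response.getEntity());
                System.out.println(html);
            }
        }
    }

This only downloads the page source; saving the page "complete" still requires parsing that source for its dependencies, as the other answers describe.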

Gerard Banasig

You have to write an application that fetches the HTML file, parses it to extract all the references, and then fetches each of the referenced files.
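
A minimal sketch of that fetch-parse-download loop, assuming the jsoup library for parsing (jsoup isn't mentioned in the question, so treat it as one possible approach); the URL and output directory are placeholders:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    import java.io.InputStream;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class PageSaver {
        public static void main(String[] args) throws Exception {
            String pageUrl = "http://example.com/"; // placeholder
            Path outDir = Paths.get("saved-page");  // placeholder
            Files.createDirectories(outDir);

            // 1. Fetch and parse the HTML
            Document doc = Jsoup.connect(pageUrl).get();

            // 2. Extract references to images, scripts and stylesheets
            for (Element el : doc.select("img[src], script[src], link[href]")) {
                String resUrl = el.hasAttr("src") ? el.absUrl("src") : el.absUrl("href");
                if (resUrl.isEmpty()) {
                    continue;
                }
                // Naive local file name; real code would need to sanitize this
                String name = resUrl.substring(resUrl.lastIndexOf('/') + 1);
                if (name.isEmpty()) {
                    continue;
                }
                // 3. Download each referenced file
                try (InputStream in = new URL(resUrl).openStream()) {
                    Files.copy(in, outDir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
                }
            }

            // 4. Save the HTML itself
            Files.write(outDir.resolve("index.html"), doc.outerHtml().getBytes("UTF-8"));
        }
    }

A complete solution would also rewrite the `src`/`href` attributes in the saved HTML to point at the local copies, which is what the browser's "Save as" does.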

stepanian

It's not so easy, because some CSS/JS/image file paths might be "hidden". Consider the following example:

<script type="...">
   document.write("<script" + " type='...' src='" + blahBlah() + "'>" + "</scr" + "ipt>");
</script>

However, fetching the page source, parsing it in search of URLs, and downloading the URLs you find covers pretty much everything you'll need.
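
As a sketch of that raw-source scan, here is a deliberately simplistic pass with java.util.regex that pulls candidate URLs out of attribute values and CSS `url(...)` references, including string literals inside inline scripts; note it still cannot catch URLs that are computed at runtime, like the `blahBlah()` example above:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class UrlScanner {
        // Matches src="...", href="..." and CSS url(...) references
        private static final Pattern URL_REF = Pattern.compile(
                "(?:src|href)\\s*=\\s*['\"]([^'\"]+)['\"]"
                + "|url\\(\\s*['\"]?([^'\")]+)['\"]?\\s*\\)");

        public static void scan(String pageSource) {
            Matcher m = URL_REF.matcher(pageSource);
            while (m.find()) {
                String url = m.group(1) != null ? m.group(1) : m.group(2);
                System.out.println(url); // candidate to download
            }
        }
    }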

Crozin