
I know how to download a webpage's source in Java. But a webpage also references images, CSS files and JS scripts by URL, and these have to be downloaded separately, for example:

<LINK REL="STYLESHEET" HREF="htmlatex.css">
<img src=p10012.gif>

If I download only the page source, rendering it offline still requires htmlatex.css and p10012.gif, so content is missing. My objective is to download all the contents of a webpage programmatically and ship them as assets of an Android app. How can I do that in Java?

Note: please let me know if my question is not clear enough.

Kaidul
  • So you want to download somebody's web site and use it as material for your own app. This is at risk of infringing copyright, at the very least. –  Dec 14 '14 at 07:48
  • No! In my case there is no copyright issue. For example, this is one of the webpages: http://uva.onlinejudge.org/external/100/10012.html. I want to download all of its contents with a robot program and provide them in my app. I swear it's permissible :) – Kaidul Dec 14 '14 at 07:50
  • You will have to parse the HTML to find URLs for external resources. [Use the search](https://stackoverflow.com/search?q=%5Bjava%5D+parse+html). – Sverri M. Olsen Dec 14 '14 at 08:03
  • I know I have to do that if I want to do it manually. But do you know whether a library exists for this? – Kaidul Dec 14 '14 at 08:05

1 Answer


I would suggest using the JSoup library, as it is a pretty good HTML parser. You can parse the HTML and then iterate over the referenced resources to download them. I am not sure, but there should be an existing example covering the same topic you asked about.
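
To make the parse-then-iterate idea concrete, here is a minimal sketch that uses only the Java standard library, so it runs without extra dependencies. It swaps in a regular expression for a real HTML parser, which is fragile (it can be fooled by commented-out markup or attributes such as data-src, and it also picks up link tags that are not stylesheets); with JSoup the extraction step would be done with its selectors instead. The class name ResourceCollector and the methods extractResourceUrls and download are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of any library.
final class ResourceCollector {

    // Matches the src/href attribute of <img>, <script> and <link> tags,
    // whether the value is double-quoted, single-quoted or unquoted.
    private static final Pattern RESOURCE = Pattern.compile(
            "<(?:img|script|link)\\b[^>]*?\\b(?:src|href)\\s*=\\s*"
                    + "(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
            Pattern.CASE_INSENSITIVE);

    // Returns the absolute URLs of the resources referenced by the page,
    // in document order and without duplicates.
    static List<String> extractResourceUrls(String html, String pageUrl) {
        URI base = URI.create(pageUrl);
        Set<String> urls = new LinkedHashSet<>();
        Matcher m = RESOURCE.matcher(html);
        while (m.find()) {
            String raw = m.group(1) != null ? m.group(1)
                    : m.group(2) != null ? m.group(2) : m.group(3);
            raw = raw.trim();
            if (raw.isEmpty()) {
                continue;
            }
            try {
                // Resolve relative references against the page URL.
                urls.add(base.resolve(raw).toString());
            } catch (IllegalArgumentException e) {
                // Skip attribute values that are not valid URIs.
            }
        }
        return new ArrayList<>(urls);
    }

    // Saves one resource into targetDir under its original file name.
    // Assumes the URL path ends in a file name; other URLs are skipped.
    static void download(String url, Path targetDir) throws IOException {
        URI uri = URI.create(url);
        String path = uri.getPath();
        if (path == null || path.isEmpty() || path.endsWith("/")) {
            return;
        }
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        Files.createDirectories(targetDir);
        try (InputStream in = uri.toURL().openStream()) {
            Files.copy(in, targetDir.resolve(fileName),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) {
        // The two tags from the question, with the page URL from the comments.
        String html = "<LINK REL=\"STYLESHEET\" HREF=\"htmlatex.css\">\n"
                + "<img src=p10012.gif>";
        String pageUrl = "http://uva.onlinejudge.org/external/100/10012.html";
        for (String url : extractResourceUrls(html, pageUrl)) {
            System.out.println(url);
        }
    }
}
```

The main method only lists the resolved URLs and makes no network requests. To actually fetch the files, each URL could be passed to download together with a target directory. On Android, network access has to run off the main thread, and the saved HTML still refers to the resources by their original relative names, so this sketch only works unchanged when the files end up in the same relative layout as on the server.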

Sariq Shaikh