In an Android WebView, I would like to display websites and download the full HTML with all images from those websites to the device's storage.

Since the WebView already downloads all of a page's assets in order to display it, loading the page in the WebView and then downloading the HTML and images a second time would be redundant work, right?

So I thought it might be possible to access the content that the WebView has already downloaded and simply copy it to the device's storage. Is this possible?

According to this page, you can set up JavaScript interfaces and then call some JavaScript statements like this:

webView.loadUrl("javascript:doSomething()");

So I could get the page's HTML with JavaScript, using document.documentElement.outerHTML (document itself has no innerHTML property) or document.getElementById('theID').innerHTML.

And for the images, is there an easier solution than using JavaScript? The problem is that I don't just want the URLs but the loaded resources themselves. The WebView has already downloaded all assets, so the question is whether it exposes access to them in some way.

As described in this question, it seems to be possible to get images using context menu events. (Maybe even background images.) But is there a solution that does not require user actions and saves all images in batch, preferably without downloading them once more?

caw

1 Answer

In short: no. There's no API that lets Java code access the resources a WebView has downloaded. You'll need to roll your own solution.

In the case of images, the only way I can think of to avoid re-downloading the resource is to write some JavaScript that, once the image has loaded, draws it into a <canvas> element and reads the bytes back as a data URL. You could then pass that data URL to Java through a JavaScript interface in your WebView, recreate the image there, and save it to disk or work with it otherwise.
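
To make the Java side of that idea concrete, here is a minimal sketch, not a definitive implementation. It assumes the page-side script hands over a base64 data URL such as data:image/png;base64,.... The names DataUrlDecoder, DecodedImage, CAPTURE_JS and the ImageSink interface object are hypothetical; ImageSink stands for whatever object you register with addJavascriptInterface. Two caveats apply: toDataURL() throws a SecurityError when the canvas has been tainted by a cross-origin image, and the canvas re-encodes the pixels, so the saved bytes are not the original file. The sketch uses java.util.Base64, which on Android needs API 26 or later; android.util.Base64 would be the substitute on older versions.

```java
import java.util.Base64;

final class DataUrlDecoder {

    // Hypothetical page-side script: draws a loaded <img> into a canvas and
    // passes the resulting data URL to an assumed interface object "ImageSink".
    static final String CAPTURE_JS =
        "(function(img){"
      + "var c=document.createElement('canvas');"
      + "c.width=img.naturalWidth;c.height=img.naturalHeight;"
      + "c.getContext('2d').drawImage(img,0,0);"
      + "ImageSink.onImage(img.src,c.toDataURL('image/png'));"
      + "})";

    // Simple holder for the decoded result.
    static final class DecodedImage {
        final String mimeType;
        final byte[] bytes;

        DecodedImage(String mimeType, byte[] bytes) {
            this.mimeType = mimeType;
            this.bytes = bytes;
        }
    }

    // Splits "data:<mime>;base64,<payload>" and decodes the payload.
    static DecodedImage decode(String dataUrl) {
        if (dataUrl == null || !dataUrl.startsWith("data:")) {
            throw new IllegalArgumentException("not a data URL");
        }
        int comma = dataUrl.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("data URL has no payload");
        }
        String header = dataUrl.substring("data:".length(), comma);
        String suffix = ";base64";
        if (!header.endsWith(suffix)) {
            throw new IllegalArgumentException("only base64 data URLs are handled");
        }
        String mimeType = header.substring(0, header.length() - suffix.length());
        byte[] bytes = Base64.getDecoder().decode(dataUrl.substring(comma + 1));
        return new DecodedImage(mimeType, bytes);
    }
}
```

The byte array returned by decode can then be written to a file with an ordinary output stream, using the MIME type to pick a file extension.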

These links would probably be helpful to you:

ksasq