7

I want to save a webpage completely from my Google Chrome extension. I added the "downloads" and "<all_urls>" permissions and confirmed that the following code saves the Google page to google.html.

  chrome.downloads.download(
            { url: "http://www.google.com",
              filename: "google.html" },
            function (x) { console.log(x); })

However, this code only saves the HTML file. Stylesheets, scripts and images are not saved. I want to save the webpage completely, as if I had saved the page from the Save dialog with Format: Webpage, Complete selected.

I looked through the documentation but couldn't find a way.

So my question is: how can I download a webpage completely from an extension using the API(s) of Google Chrome?

itchyny

2 Answers

10

The downloads API downloads a single resource only. To save a complete web page, first open the page in a tab, then export it as MHTML using chrome.pageCapture.saveAsMHTML, create a blob: URL for the exported Blob using URL.createObjectURL, and finally save that URL using the chrome.downloads.download API.

The pageCapture API requires a valid tabId. For instance:

// Create new tab, wait until it is loaded and save the page
chrome.tabs.create({
    url: 'http://example.com'
}, function(tab) {
    chrome.tabs.onUpdated.addListener(function func(tabId, changeInfo) {
        if (tabId == tab.id && changeInfo.status == 'complete') {
            chrome.tabs.onUpdated.removeListener(func);
            savePage(tabId);
        }
    });
});

function savePage(tabId) {
    chrome.pageCapture.saveAsMHTML({
        tabId: tabId
    }, function(blob) {
        var url = URL.createObjectURL(blob);
        // Optional: chrome.tabs.remove(tabId); // to close the tab
        chrome.downloads.download({
            url: url,
            filename: 'whatever.mhtml'
        });
    });
}
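
The hard-coded filename in savePage could instead be derived from the page's URL. The following is only a sketch: mhtmlFilenameFromUrl is a hypothetical helper, not part of any chrome.* API, and the sanitising rule (collapse anything other than letters, digits, dots and hyphens into underscores) is an assumption about what makes a safe relative filename.

```javascript
// Hypothetical helper (not a chrome.* API): build a relative .mhtml
// filename from a page URL, using only its host name and path.
function mhtmlFilenameFromUrl(pageUrl) {
    var parsed = new URL(pageUrl);
    var base = (parsed.hostname + parsed.pathname)
        // Assumption: collapse every other run of characters into "_".
        .replace(/[^A-Za-z0-9.-]+/g, '_')
        // Trim leading and trailing underscores or dots.
        .replace(/^[_.]+|[_.]+$/g, '');
    // Fall back to a fixed name if nothing usable is left.
    return (base || 'page') + '.mhtml';
}
```

Since the extension already knows the URL it passed to chrome.tabs.create, it could pass the result of mhtmlFilenameFromUrl for that URL as the filename option instead of 'whatever.mhtml'.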

To try it out, put the tab-creation code and the savePage function in background.js, add the permissions to manifest.json (as shown below) and reload the extension. example.com will then be opened, and the web page will be saved as a self-contained MHTML file.

{
    "name": "Save full web page",
    "version": "1",
    "manifest_version": 2,
    "background": {
        "scripts": ["background.js"]
    },
    "permissions": [
        "pageCapture",
        "downloads"
    ]
}
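
A later comment on this answer reports `URL.createObjectURL is not a function`. One possible cause, offered here as an assumption rather than a confirmed diagnosis, is that the code is running in a Manifest V3 background service worker, where URL.createObjectURL is not exposed. A minimal sketch of a workaround is to hand chrome.downloads.download a data: URL instead of a blob: URL. The names bytesToDataUrl and savePageAsDataUrl are hypothetical, and the fallback MIME type is an arbitrary choice.

```javascript
// Hypothetical helper: encode raw bytes as a base64 data: URL.
function bytesToDataUrl(bytes, mimeType) {
    var binary = '';
    var chunkSize = 0x8000; // convert in chunks to keep apply() argument lists small
    for (var i = 0; i < bytes.length; i += chunkSize) {
        binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
    }
    return 'data:' + mimeType + ';base64,' + btoa(binary);
}

// Variant of savePage that does not call URL.createObjectURL.
function savePageAsDataUrl(tabId) {
    chrome.pageCapture.saveAsMHTML({ tabId: tabId }, function(blob) {
        blob.arrayBuffer().then(function(buffer) {
            chrome.downloads.download({
                // Assumption: fall back to a generic type if the Blob has none.
                url: bytesToDataUrl(new Uint8Array(buffer),
                                    blob.type || 'application/octet-stream'),
                filename: 'whatever.mhtml'
            });
        });
    });
}
```

The whole capture ends up inside one URL string, so this approach is probably only reasonable for pages of moderate size.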
Rob W
  • Thank you for your informative comment. I tried your code and it actually works. Yet the saved file was not in the form I firstly expected, but using mhtml format sounds like a nice idea. Thank you. – itchyny Jul 10 '14 at 10:57
  • Sorry, I'm not used to stackoverflow. Thanks again. – itchyny Jul 10 '14 at 11:18
  • @rob-w I'm using the `manifest_version=2` in my app. I believe this is required for the chrome *apps*. But the downloads api isn't working on chrome apps. How's that working for you? – user3677331 Jun 23 '15 at 08:27
  • @user3677331 `chrome.downloads` is only available to extensions, not apps. See https://code.google.com/p/chromium/issues/detail?id=274673 – Rob W Jun 23 '15 at 08:46
  • `Error handling response: TypeError: URL.createObjectURL is not a function` Were there some security changes in Chrome? – Anton Dec 10 '22 at 21:49
-1

No, the downloads API does not download all of a page's files (images, JS, CSS, etc.) for you. You should use a tool such as HTTrack instead.

Claudiu Creanga