14

I want to see if the current tab is a PDF file from a background page.

I can check whether the URL ends in .pdf, but some PDF files are served from URLs without that extension.

Orny
  • Did you ever develop an extension with this functionality? I would love to have such an extension, but don't want to learn how to code one for a few personal uses. – msbg Mar 17 '13 at 15:50

5 Answers

12

Issuing a new request just to get the MIME type is a bit heavy, and not reliable. For instance, if the currently displayed page is the result of a POST form submission, then issuing a GET request will usually not lead to the same page.

If you're developing an extension that frequently needs access to this information, use the chrome.webRequest API to track the responses. The following demo extension shows the content type upon click of the browser button:

// background.js
var tabToMimeType = {};
chrome.webRequest.onHeadersReceived.addListener(function(details) {
    if (details.tabId !== -1) {
        var header = getHeaderFromHeaders(details.responseHeaders, 'content-type');
        // If the header is set, use its value. Otherwise, use undefined.
        tabToMimeType[details.tabId] = header && header.value.split(';', 1)[0];
    }
}, {
    urls: ['*://*/*'],
    types: ['main_frame']
}, ['responseHeaders']);

chrome.browserAction.onClicked.addListener(function(tab) {
    alert('Tab with URL ' + tab.url + ' has MIME-type ' + tabToMimeType[tab.id]);
});

function getHeaderFromHeaders(headers, headerName) {
    for (var i = 0; i < headers.length; ++i) {
        var header = headers[i];
        if (header.name.toLowerCase() === headerName) {
            return header;
        }
    }
}

Notes:

  • This extension only shows the result for tabs that are loaded after the extension is loaded.
  • This only works on http/https pages. ftp:, file:, filesystem:, blob: and data: are not supported.
  • When no MIME type is specified by the server or when the MIME type is text/plain, Chrome falls back to MIME sniffing unless the X-Content-Type-Options: nosniff header is set. In the first case, the detected MIME type could be anything. In the latter case, the default MIME type is text/plain.

For completeness, here is a manifest.json file that can be used to test the previous code:

{
    "name": "Click button to see MIME",
    "version": "1",
    "manifest_version": 2,
    "background": {
        "scripts": ["background.js"],
        "persistent": true
    },
    "browser_action": {
        "default_title": "Show MIME"
    },
    "permissions": [
        "webRequest",
        "activeTab",
        "*://*/*"
    ]
}
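
The note on MIME sniffing above means that the recorded Content-Type value is not always the type Chrome ends up rendering. As a rough sketch, the rule from that note can be written as a check over the same responseHeaders array. The function names `findHeaderValue` and `declaredTypeMayBeSniffed` are made up for this illustration, and the logic is a simplification of the note, not Chrome's actual sniffing algorithm:

```javascript
// Hypothetical helpers, not part of the chrome.webRequest API.
// `headers` is assumed to be an array of {name, value} objects,
// like details.responseHeaders in onHeadersReceived.
function findHeaderValue(headers, lowerCaseName) {
    for (const header of headers || []) {
        if (header.name.toLowerCase() === lowerCaseName) {
            // value can be missing when the header is only available as binary data.
            return header.value || '';
        }
    }
    return undefined; // header not present
}

// Returns true when, per the note above, sniffing may change the effective type:
// no Content-Type (or text/plain) and no "X-Content-Type-Options: nosniff".
function declaredTypeMayBeSniffed(headers) {
    const options = findHeaderValue(headers, 'x-content-type-options') || '';
    if (options.trim().toLowerCase() === 'nosniff') {
        return false;
    }
    const contentType = findHeaderValue(headers, 'content-type');
    if (contentType === undefined) {
        return true;
    }
    const mimeType = contentType.split(';', 1)[0].trim().toLowerCase();
    return mimeType === '' || mimeType === 'text/plain';
}
```

When this returns true, the value stored in tabToMimeType should be treated as a hint only.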
Rob W
  • Your answer is very detailed and helpful. Thanks! – Ivan P Nov 21 '14 at 04:12
  • 2
    Really helpful. This is way better than the accepted answer - no extensions should be re-requesting the header in practice. – Kevin Qi Feb 08 '15 at 03:41
  • Nice answer, but unfortunately webRequest requires to set `"persistent": true`, which prevents using the now preferred event pages. The equivalent API for event pages, declarativeWebRequest, is still in beta and actually seems to be completely on hold at this point. – Pyves Mar 24 '18 at 10:40
4

You can't get it using the current Chrome API afaik. What you can do is load the page again through XHR and check the returned Content-Type header. Something like this:

background html:

chrome.tabs.onUpdated.addListener(function(tabId, changeInfo, tab) {
    if(changeInfo.status == "loading") {
        if(checkIfUrlHasPdfExtension(tab.url)) {
            //.pdf
            pdfDetected(tab);
        } else {
             var xhr = new XMLHttpRequest();
             xhr.open("GET", tab.url, true);
             xhr.onreadystatechange = function() {
               if (xhr.readyState == 4) {
                 var contentType = xhr.getResponseHeader("Content-Type");
                 if(checkIfContentTypeIsPdf(contentType)) {
                    pdfDetected(tab);
                 }
               }
             }
             xhr.send();
        }
    }
});

manifest.json:

"permissions": [
    "tabs", "http://*/*", "https://*/*"
]

For PDF files, the returned content type should be application/pdf. Keep in mind that the Content-Type header can also contain an encoding, for example text/html; charset=UTF-8.
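
The listener above calls checkIfUrlHasPdfExtension and checkIfContentTypeIsPdf without defining them. A minimal sketch of how they could look follows. The bodies are assumptions, not the answer author's code. The URL check looks only at the path so that a query string or fragment does not count, and the content-type check strips parameters such as the charset before comparing:

```javascript
// Hypothetical implementations of the two helpers named in the listener above.
function checkIfUrlHasPdfExtension(url) {
    try {
        // Compare only the path, so "?file=a.pdf" or "#a.pdf" does not match.
        return new URL(url).pathname.toLowerCase().endsWith('.pdf');
    } catch (e) {
        return false; // not a parseable absolute URL
    }
}

function checkIfContentTypeIsPdf(contentType) {
    if (!contentType) {
        return false; // getResponseHeader returns null when the header is absent
    }
    // Drop parameters such as "; charset=UTF-8" before comparing.
    return contentType.split(';', 1)[0].trim().toLowerCase() === 'application/pdf';
}
```

pdfDetected(tab) is left out because what happens on detection depends on the extension.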

serg
  • 1
    Thanks I believe that would work. But I'm afraid I won't be using it because every page would be loaded twice. – Orny Feb 08 '11 at 12:14
  • @Orny I agree, I would just check for pdf extension, should be enough for 99% cases – serg Feb 08 '11 at 16:25
  • I was searching for something like this, and because I'll use it only when opening the extension popup, I think (hope) that the request will use the cached page in most cases. – Omiod May 15 '11 at 07:29
  • @serg How to propagate type to `content_scripts`-files in the `manifest.js`? – Peter Rader Oct 05 '17 at 10:34
2

You can evaluate the property document.contentType on the current tab. Here is an example using browserAction:

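chrome.browserAction.onClicked.addListener((tab) => {
    // onClicked already passes the active tab, so the deprecated
    // chrome.tabs.getSelected() call is not needed.
    chrome.tabs.executeScript(tab.id, { code: 'document.contentType' }, (results) => {
        // results is undefined when the script could not be injected into the tab.
        alert(results ? results[0] : chrome.runtime.lastError.message);
    });
});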

This property returns the MIME type that the document is being rendered as, not the Content-Type header (no information about the charset).

leizh
1

A somewhat hackish way (I have no idea if it always works or just sometimes) is to look at the page content. There you will find an element for Chrome's PDF viewer. It looks along these lines:

<embed width="100%" height="100%" name="plugin" src="https://example.com/document.pdf" type="application/pdf">

You can check that "type" attribute to see what you are dealing with.

Ivan P
  • 1
    This really did the trick for me, thanks a lot! It's somewhat hackish indeed but it seems to be the only way that works also for "file://" URLs (provided that the manifest.json declares that injected scripts should go into URLs that match a "file://*" selector). Here's the code I used in the injected script: `if (document.body.childElementCount === 1) { var embed = document.body.firstElementChild; if (embed.tagName === "EMBED" && embed.getAttribute("type") === "application/pdf") { /* do something */ } }` – robamler Nov 07 '15 at 10:42
0

I needed this in one of my extensions and used an approach very similar to the answer given by @serg, but with a HEAD request instead. In theory, a HEAD request should be identical to a GET request except that no response body is sent, which for an image or file could save quite a bit of data and waiting time.

I also split and shift the header to drop any charsets that might be appended on the content-type.

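function getContentType(tab, callback) {
    var xhr = new XMLHttpRequest();
    // Asynchronous request; a synchronous XHR would block the background page.
    xhr.open("HEAD", tab.url, true);
    xhr.onload = function() {
        // getResponseHeader returns null when the header is absent.
        var contentType = xhr.getResponseHeader("Content-Type");
        if (xhr.status === 200 && contentType) {
            // Drop parameters such as "; charset=UTF-8".
            callback(contentType.split(";").shift().trim());
        } else {
            callback('Unknown');
            console.error(xhr.statusText);
        }
    };

    xhr.onerror = function() {
        // Network-level failure: still report a result to the caller.
        callback('Unknown');
        console.error(xhr.statusText);
    };

    xhr.send();
}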
Community
Adam42