In my Chrome extension, I scrape selected pages using an iframe controlled by the extension. If the user wants to scrape their Gmail inbox, the request fails with:

Refused to display 'https://mail.google.com/' in a frame because it set 'X-Frame-Options' to 'sameorigin'.

However, if they want to scrape their Google Drive, it works.

Both Drive and Gmail set `X-Frame-Options: SAMEORIGIN`.

In my code, I pre-empt that by using declarativeNetRequest to remove this header (and various others).

    chrome.declarativeNetRequest.updateSessionRules({
        removeRuleIds: [1001],
        addRules: this.createRulesFor(patterns, [1001], tabId),
    });

    ...
    
    private createRulesFor(patterns: string[], ids: number[], tabId: number): Rule[] {
        let rules: Rule[] = [];

        rules.push({
            id: ids[0],
            priority: 1,
            action: {
                type: RuleActionType.MODIFY_HEADERS,
                responseHeaders : [
                    ...this.createHeaderToRemove("X-Frame-Options"),
                    ...this.createHeaderToRemove("Content-Security-Policy"),
                    ...this.createHeaderToRemove("Content-Security-Policy-Report-Only"),
                    ...this.createHeaderToRemove("Cross-Origin-Embedder-Policy"),
                    ...this.createHeaderToRemove("Cross-Origin-Opener-Policy"),
                    ...this.createHeaderToRemove("X-Content-Type-Options"),
                    ...this.createHeaderToRemove("X-Xss-Protection"),
                ]
            },
            // XXX doesn't handle HTTP2 / 999, whatever that is (e.g. linkedin)
            condition: {
                // urlFilter: patterns[i],
                tabIds: [ tabId ]
            }
        });
        return rules
    }
    ...
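    private createHeaderToRemove(headerName: string) {
        // Emit the header name in three casings: as given, all lowercase, and
        // lowercase with only the first character capitalized (e.g. "X-frame-options").
        const lowercaseHeader = headerName.toLowerCase();
        const capitalizedHeader = headerName.charAt(0).toUpperCase() + lowercaseHeader.slice(1);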
        return [
            {
                header: headerName,
                operation: HeaderOperation.REMOVE
            },
            {
                header: lowercaseHeader,
                operation: HeaderOperation.REMOVE
            },
            {
                header: capitalizedHeader,
                operation: HeaderOperation.REMOVE
            }
        ];
    }
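For comparison, here is a minimal sketch of the same rule built as a plain object. It rests on two assumptions that are not verified here: that one entry per header is enough (HTTP header names are case-insensitive, and the sketch assumes declarativeNetRequest matches them the same way), and that the scraper only needs iframe loads, so `resourceTypes` is set to `sub_frame` explicitly. The names `FrameRule`, `HeaderRemoval`, `HEADERS_TO_REMOVE` and `buildFrameRule` are hypothetical; the local interfaces stand in for the `chrome.declarativeNetRequest` types.

```typescript
// Local stand-ins for the chrome.declarativeNetRequest types, so the sketch
// type-checks without @types/chrome. The string literals are the values of
// RuleActionType.MODIFY_HEADERS, HeaderOperation.REMOVE and ResourceType.SUB_FRAME.
interface HeaderRemoval {
  header: string;
  operation: "remove";
}

interface FrameRule {
  id: number;
  priority: number;
  action: { type: "modifyHeaders"; responseHeaders: HeaderRemoval[] };
  condition: { tabIds: number[]; resourceTypes: string[] };
}

// Same header list as in the question, one entry each.
// Assumption: header-name matching is case-insensitive.
const HEADERS_TO_REMOVE: string[] = [
  "X-Frame-Options",
  "Content-Security-Policy",
  "Content-Security-Policy-Report-Only",
  "Cross-Origin-Embedder-Policy",
  "Cross-Origin-Opener-Policy",
  "X-Content-Type-Options",
  "X-Xss-Protection",
];

// Builds one session rule that strips the headers for iframe loads in a tab.
function buildFrameRule(id: number, tabId: number): FrameRule {
  return {
    id,
    priority: 1,
    action: {
      type: "modifyHeaders",
      responseHeaders: HEADERS_TO_REMOVE.map(
        (header): HeaderRemoval => ({ header, operation: "remove" }),
      ),
    },
    // Assumption: only sub_frame requests from this tab need rewriting.
    condition: { tabIds: [tabId], resourceTypes: ["sub_frame"] },
  };
}
```

In the extension, the result would be passed as `addRules: [buildFrameRule(1001, tabId)]` to `chrome.declarativeNetRequest.updateSessionRules`; with `@types/chrome` the string literals may need to be swapped for the enum members.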

(In case you're wondering about the patterns - I was originally injecting wildcarded urls based on what I was about to scrape, but then I realized I could just use the tabId instead.)

When I attach a chrome.declarativeNetRequest.onRuleMatchedDebug listener, it shows that the rule matches for Google Drive but not for Gmail.
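A minimal sketch of that kind of match logging, for anyone reproducing this. The event is passed in as a parameter so the function can be exercised without a browser; `logRuleMatches`, `RuleMatchedEvent` and the `sink` array are hypothetical names, and the interface only mirrors the few `MatchedRuleInfoDebug` fields used here.

```typescript
// Only the fields of MatchedRuleInfoDebug that this sketch reads.
interface MatchedRuleInfoDebug {
  request: { url: string; tabId: number; type: string };
  rule: { ruleId: number; rulesetId: string };
}

// Shape of an event object such as chrome.declarativeNetRequest.onRuleMatchedDebug.
interface RuleMatchedEvent {
  addListener(callback: (info: MatchedRuleInfoDebug) => void): void;
}

// Records one line per matched request in the given tab.
function logRuleMatches(event: RuleMatchedEvent, tabId: number, sink: string[]): void {
  event.addListener((info) => {
    if (info.request.tabId !== tabId) {
      return; // ignore matches from other tabs
    }
    sink.push(`rule ${info.rule.ruleId} matched ${info.request.type} ${info.request.url}`);
  });
}
```

In the extension this would be called as `logRuleMatches(chrome.declarativeNetRequest.onRuleMatchedDebug, tabId, lines)`. That event requires the `declarativeNetRequestFeedback` permission and only fires for unpacked extensions.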

One theory is that the two apps respond differently. If I take the base URL from Chrome with "Copy as cURL" and run it with curl, Gmail returns a series of three or four 302s before eventually returning a 200 (when I manually plug each Location back in), whereas Drive returns a 200 immediately. If this explains the issue, then:

a) Why is the error message specifically about X-Frame-Options?

b) How do I instrument my iframe (in a tab, controlled by the Chrome extension) to be able to observe and follow the 302?
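On (b): the browser follows 3xx responses inside the iframe on its own, so the redirects only need to be observed, not followed manually. One way to observe them is a `chrome.webRequest.onBeforeRedirect` listener; the sketch below is a rough illustration under that assumption. The event is injected so the function runs without a browser, and `recordRedirects`, `RedirectHop` and the local interfaces are hypothetical names that mirror only the fields used.

```typescript
// Only the fields of the onBeforeRedirect details object that this sketch reads.
interface RedirectDetails {
  url: string;
  redirectUrl: string;
  statusCode: number;
  tabId: number;
}

interface RequestFilter {
  urls: string[];
  tabId?: number;
  types?: string[];
}

// Shape of an event object such as chrome.webRequest.onBeforeRedirect.
interface BeforeRedirectEvent {
  addListener(callback: (details: RedirectDetails) => void, filter: RequestFilter): void;
}

interface RedirectHop {
  from: string;
  to: string;
  status: number;
}

// Registers a listener and returns an array that fills up with each redirect hop
// seen for iframe loads in the given tab.
function recordRedirects(event: BeforeRedirectEvent, tabId: number): RedirectHop[] {
  const hops: RedirectHop[] = [];
  event.addListener(
    (details) => {
      hops.push({ from: details.url, to: details.redirectUrl, status: details.statusCode });
    },
    { urls: ["<all_urls>"], tabId, types: ["sub_frame"] },
  );
  return hops;
}
```

In the extension this would be `recordRedirects(chrome.webRequest.onBeforeRedirect, tabId)`, which needs the `webRequest` permission plus host permissions for the URLs involved.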

user717847
  • Open `chrome://policy` and see if there's ExtensionSettings with runtime_blocked_hosts for gmail inside. – wOxxOm May 29 '23 at 04:13
  • Nope - I don't see any. – user717847 May 29 '23 at 11:16
  • You should also unregister its service worker and clear the cache, [example](/a/69177790). – wOxxOm May 29 '23 at 11:18
  • That sounds plausible - but clearing the cache/service worker sounds like a disaster for usability. I want this to be able to run without interfering with the user's regular browsing. Clearing gmail's cache would be rather severe. :( – user717847 May 31 '23 at 22:00
  • This is how the API works. Note that cache means the worker's cache and not the web cache. Usually there should be no difference for the user other than possibly redownloading some of the worker's assets. – wOxxOm Jun 01 '23 at 05:23
  • That worked! Though now I ran into their frame busting code :( – user717847 Jun 02 '23 at 03:23
  • If you want to post it as an answer I'll accept it. Thanks – user717847 Jun 02 '23 at 03:23
  • This is already answered in the example I've linked, so I'd rather just mark this as a duplicate... – wOxxOm Jun 02 '23 at 05:06
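The fix suggested in the comments (unregister the site's service worker and clear the worker's cache) can be sketched roughly as below. This is not the linked answer's code; it is a minimal illustration that assumes `chrome.browsingData.remove` is called with an `origins` filter and the `serviceWorkers` and `cacheStorage` data types. The removal function is injected so the sketch runs without a browser; `clearWorkerState` and `RemoveFn` are hypothetical names.

```typescript
// Subsets of the chrome.browsingData RemovalOptions and DataTypeSet used here.
interface RemovalOptions {
  origins: string[];
}

interface DataTypeSet {
  serviceWorkers: boolean;
  cacheStorage: boolean;
}

type RemoveFn = (options: RemovalOptions, dataToRemove: DataTypeSet) => Promise<void>;

// Removes the service worker registration and its Cache Storage for one origin only.
// The regular HTTP cache, cookies and other site data are not part of the request.
function clearWorkerState(remove: RemoveFn, origin: string): Promise<void> {
  return remove({ origins: [origin] }, { serviceWorkers: true, cacheStorage: true });
}
```

In the extension this would look like `clearWorkerState((o, d) => chrome.browsingData.remove(o, d), "https://mail.google.com")`, run before loading the iframe, with the `browsingData` permission declared in the manifest.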
