5

In my Chrome extension I am injecting the content script into all IFRAMEs inside a page. Here's a part of the manifest.json file:

"content_scripts": [
    {
        "run_at": "document_end",
        "all_frames" : true,
        "match_about_blank": true,
        "matches": ["http://*/*", "https://*/*"],
        "js": ["content.js"]
    }
],

So a single web page having multiple IFRAMEs will end up running that many copies of my injected content.js.

The logic inside content.js collects data from each IFRAME it's injected into, or from the main/top page, and sends it back to the background script (using chrome.runtime.sendMessage). The background script in turn needs to store the data in a global variable that is later used by the extension itself.

The issue I'm facing is that the app needs to distinguish between the "data" received from multiple IFRAMEs. My data collection method can be called repeatedly as the user interacts with the page, so I cannot simply "dump" the data received by the background script into an array. Instead I need dictionary-type storage.
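A minimal sketch of what such dictionary-type storage could look like in the background script, assuming every message carries some stable per-frame key. The names `frameStore`, `storeFrameData`, and the `frameKey`/`data` message fields are hypothetical and only illustrate the idea; how to obtain a suitable key is exactly the open question.

```javascript
// Hypothetical background-side storage: tab id -> (frame key -> latest data).
const frameStore = new Map();

// Stores the data reported by one frame of one tab. A repeated report with
// the same key overwrites the previous entry instead of adding a duplicate.
function storeFrameData(store, tabId, frameKey, data) {
  if (!store.has(tabId)) {
    store.set(tabId, new Map());
  }
  store.get(tabId).set(frameKey, data);
  return store;
}

// Wire it up only when running inside an extension. Assumption: messages
// look like { frameKey: "...", data: ... }.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(function (msg, sender) {
    // sender.tab is set when the message comes from a content script.
    if (sender.tab) {
      storeFrameData(frameStore, sender.tab.id, msg.frameKey, msg.data);
    }
  });
}
```

With this layout, repeated calls from the same frame replace its entry, while two frames only stay separate if their keys differ.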

I can tell if the data is coming from an IFRAME or from the top page by running the following:

//From the `content.js`
var isIframe = window != window.top;

My thinking was that if I collected the page URL of each IFRAME, I could use it as a unique key under which to store the data in my dictionary-type global variable:

//Again from content.js
var strUniqueIFrameURL = document.URL;

Well, that is not going to work, because two or more IFRAMEs can have the same URL.

Hence my original question: how can I tell IFRAMEs on the page apart? Is there some unique ID or something that Chrome assigns to them?

c00000fd
  • To reformulate your question. Say there is a page with 3 ` – Xan Sep 24 '14 at 10:10
  • @Xan: Yes, you are correct. Except that `iframes` can be both same- and cross-origin. (I am not the author of the pages that my content script is injected into, so I'm assuming both scenarios.) – c00000fd Sep 24 '14 at 10:14
  • Without cross-origin problems, it's easily solved with `window.frameElement` and `window.parent.document.querySelectorAll("iframe")`. But you can't access those with cross-origin iframes. – Xan Sep 24 '14 at 10:15
  • Wait! It's solvable! Since the content script CAN access `window.parent`, unlike frame's own code. – Xan Sep 24 '14 at 10:22
  • @Xan: Hmm. Thanks. I'll have to leave it off till morning. It's too late now... can't think straight. Will let you know if it worked. – c00000fd Sep 24 '14 at 10:34
  • Just posted an answer, see if it works for you. – Xan Sep 24 '14 at 10:35

3 Answers

8

You can identify the relative place of the document in the hierarchy of iframes. Depending on the structure of the page, this can solve your problem.

Your extension is able to access window.parent and its frames. This should work, or at least works for me in a test case:

// Returns the index of the iframe in the parent document,
//  or -1 if we are the topmost document
function iframeIndex(win) {
  win = win || window; // Assume self by default
  if (win.parent != win) {
    for (var i = 0; i < win.parent.frames.length; i++) {
      if (win.parent.frames[i] == win) { return i; }
    }
    throw Error("In a frame, but could not find myself");
  } else {
    return -1;
  }
}

You can modify this to support nesting iframes, but the principle should work.

I was itching to do it myself, so here you go:

// Returns a unique index in iframe hierarchy, or empty string if topmost
function iframeFullIndex(win) {
   win = win || window; // Assume self by default
   if (iframeIndex(win) < 0) {
     return "";
   } else {
     return iframeFullIndex(win.parent) + "." + iframeIndex(win);
   }
}
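
A hedged sketch of how such an index could travel to the background script. It uses an iterative variant of the same walk up the `parent` chain; `framePath`, `buildFrameMessage`, and the `frameKey`/`url`/`data` message fields are hypothetical names, not part of the code above. Unlike `iframeFullIndex`, this variant has no leading dot.

```javascript
// Hypothetical helper: walks up from `win` to the top window and returns the
// frame indexes, outermost first ([] for the top window).
function framePath(win) {
  const path = [];
  while (win.parent !== win) {
    const siblings = win.parent.frames;
    let index = -1; // stays -1 if the frame cannot find itself
    for (let i = 0; i < siblings.length; i++) {
      if (siblings[i] === win) { index = i; break; }
    }
    path.unshift(index);
    win = win.parent;
  }
  return path;
}

// Builds the message payload; the field names are assumptions.
function buildFrameMessage(win, url, data) {
  return { frameKey: framePath(win).join("."), url: url, data: data };
}

// Send only when actually running as a content script.
if (typeof chrome !== "undefined" && chrome.runtime &&
    chrome.runtime.sendMessage && typeof window !== "undefined") {
  chrome.runtime.sendMessage(buildFrameMessage(window, document.URL, {}));
}
```

For the first child of the second top-level frame this yields the key "1.0", and an empty string for the top window.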
Xan
  • Just from the quick test, it did work for the page I was "struggling" with. Thanks! I'll try to modify it for nested iframes though. I think your method needs some "recursion". – c00000fd Sep 24 '14 at 10:43
  • @c00000fd Awesome! If you do improve the code, please share it as another answer. – Xan Sep 24 '14 at 10:45
  • @c00000fd Eh, did it myself. With some "recursion". – Xan Sep 24 '14 at 10:58
  • Sorry, I'm in a different time zone than you. I appreciate your work! I had to change your recursive method though. – c00000fd Sep 24 '14 at 20:19
  • At first I was surprised this does not trigger CORS. We are allowed to access parents of cross-origin window documents recursively. It seems to me CORS is not triggered because we are not accessing any properties on those parent window objects, except for `frames`. Do you know if this is the right interpretation @Xan? – Gaurang Tandon Jul 07 '22 at 09:12
  • @GaurangTandon Chrome extension code works around many Web security mechanisms, including things like CORS. I doubt non-content-script code in the parent frame could do this. – Xan Jul 07 '22 at 09:46
1

Just to expand on @Xan's answer, here's my method of getting an IFRAME's index, taking into account its possible nesting within other IFRAMEs. I'll use forward-iframe notation, meaning that the parent IFRAME's index is given first, followed by the child indexes, and so on. Also, to prevent possible confusion with floating-point numbers, I'll use an underscore as the separator instead of a dot.

So to answer my original question: once I have the IFRAME's index within the page, it will uniquely identify that IFRAME in the page (coupled with the IFRAME's URL).

Here's the code to get it:

function iframeIndex(wnd)
{
    //RETURN:
    //      = "" for top window
    //      = IFrame zero-based index with nesting, example: "2", or "0_4"
    //      = "?" if error
    return _iframeIndex(wnd || window);     // Assume self by default
}

function _iframeIndex(wnd)
{
    var resInd = "";

    var wndTop = window.top;

    if(wnd == wndTop)
        return resInd;

    var wndPar = wnd.parent;

    if(wndPar != wndTop)
    {
        resInd = _iframeIndex(wndPar) + "_";
    }

    var frmsPar = wndPar.frames;
    for(var i = 0; i < frmsPar.length; i++)
    {
        if(frmsPar[i] == wnd)
            return resInd + i;
    }

    return resInd + "?";
}
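
A minimal sketch of coupling that index with the IFRAME's URL to form a dictionary key. The helper name `makeFrameKey` and the `|` separator are assumptions made for illustration. Since the index part contains only digits, underscores, or `?`, the first `|` always marks where the URL begins.

```javascript
// Hypothetical helper: combines the nested index (e.g. "0_4", or "" for the
// top window) with the document URL into one dictionary key.
function makeFrameKey(index, url) {
  return index + "|" + url;
}

// In the content script this could be used as (assumption):
//   var key = makeFrameKey(iframeIndex(), document.URL);
```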
Community
c00000fd
0

You can generate a pseudo-unique id using a combination of timestamp and a random number every time a content script loads, like this:

var psUid = (new Date()).getTime() + '_' + Math.random();

And then send all your data-related messages to the background with this ID.

MeLight
  • Thanks. But such ID must remain the same per IFRAME. – c00000fd Sep 24 '14 at 08:23
  • What do you mean by "remain the same"? Even after the iframe refreshes? – MeLight Sep 24 '14 at 08:29
  • Yes. They should remain the same if page reloads. Since the background script does not change after page reloads, if I start making up IFRAME ids like you showed, when I store their data in my dictionary it will create duplicate records every time the page reloads. – c00000fd Sep 24 '14 at 08:39
  • What if the page that contains the iframe refreshes? Do you need to keep the IDs persistent through those refreshes too? – MeLight Sep 24 '14 at 08:45
  • So let me get this straight: you have two iframes (`x` and `y`), with the same URL, each should write data to `VAR_X` and `VAR_Y` in the background. When you refresh each of the iframes they land on the same URL (which means they load the same data?) but you still need to tie them to their old variables? – MeLight Sep 24 '14 at 09:00
  • Yes, they should be stored under two different keys in the dictionary although both iframes have the same `document.URL`. Such keys should remain the same after page reloads though. – c00000fd Sep 24 '14 at 09:07
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/61829/discussion-between-melight-and-c00000fd). – MeLight Sep 24 '14 at 09:10