I've been experimenting with Chrome's extension mechanism, trying to write an extension that manipulates Google results (adding comments, screenshots, favicons, etc.).

So far I've managed to write code that uses a RegEx to add imgs to a link, and it works OK.

The problem is that it doesn't work on Google results. I read here that this happens because the page hasn't fully loaded, so I added a 'DOMContentLoaded' listener, but it didn't help.

Here's my code (content script):

function parse_google() {
    document.body.innerHTML = document.body.innerHTML.replace(
        new RegExp("<a href=\"(.*)\"(.*)</a>", "g"),
        "<img src=\"http://<path-to-img.gif>\" /><a href=\"$1\"$2</a>"
    );
    alert("boooya!");
}
alert("content script: before");
document.addEventListener('DOMContentLoaded', parse_google(), false);
alert("content script: end");

I get all the alerts, but it doesn't work for Google. Why?

Manu
  • The regex is badly shown, but it works. All it does is add an <img> tag wherever there's an <a> tag. – Manu Aug 20 '12 at 18:15
  • Not completely clear on what the question is since the link you provided would seem to solve your issue. You'll get better answers if you ask a more specific question than the very broad "can you advise". – Mike Grace Aug 20 '12 at 18:38
  • Mike, thanks for the answer. The problem is that I've followed the method demonstrated there, and it hasn't solved the problem. I still can't get my regex to affect the Google results. Thanks in advance. – Manu Aug 20 '12 at 18:46
  • Have you tried listening for DOMNodeInserted? – Ido Green Aug 21 '12 at 12:20

2 Answers

"DOMContentLoaded" refers to the static HTML of the page, but Google's search results are fetched using AJAX, thus are not there yet when the "DOMContentLoaded" event is triggered.

You could use a MutationObserver instead, to observe "childList" DOM mutations on a root node and its descendants.
(If you choose this approach the mutation-summary library might come in handy.)
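
Separately from the AJAX timing issue, the listener in the question is registered as addEventListener('DOMContentLoaded', parse_google(), false). That calls parse_google right away and hands its return value (undefined) to addEventListener, so nothing is attached to the event. Fixing this alone would not make the script work on Google, since the results still arrive later. Here is a small sketch of the difference; FakeTarget and parseGoogle are hypothetical stand-ins so the snippet runs outside a browser, not browser APIs.

```javascript
// Hypothetical minimal event target, used only to illustrate the difference
// between passing a function and passing the result of calling it.
class FakeTarget {
    constructor() { this.listeners = []; }
    addEventListener(type, listener) {
        // The DOM ignores a null/undefined listener; this stand-in does the
        // same for anything that is not a function.
        if (typeof listener === "function") {
            this.listeners.push({ type: type, listener: listener });
        }
    }
    dispatch(type) {
        this.listeners
            .filter(function (entry) { return entry.type === type; })
            .forEach(function (entry) { entry.listener(); });
    }
}

var calls = [];
function parseGoogle() { calls.push("parsed"); }

var wrong = new FakeTarget();
wrong.addEventListener("DOMContentLoaded", parseGoogle()); // runs right now
var callsAfterWrongRegistration = calls.length; // ran during registration
wrong.dispatch("DOMContentLoaded");             // no listener was stored
var callsAfterWrongDispatch = calls.length;     // unchanged by the event

var right = new FakeTarget();
right.addEventListener("DOMContentLoaded", parseGoogle); // pass the function
var callsAfterRightRegistration = calls.length; // not called yet
right.dispatch("DOMContentLoaded");
var callsAfterRightDispatch = calls.length;     // called when the event fired
```

This would also explain why the "boooya!" alert in the question shows up between the "before" and "end" alerts rather than after the page has loaded.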

After a (really shallow) search, I found out that (at least for me) Google places its results in a div with id search. Below is the code of a sample extension that does the following:

  1. Registers a MutationObserver to detect the insertion of div#search into the DOM.

  2. Registers a MutationObserver to detect "childList" changes in div#search and its descendants.

  3. Whenever an <a> node is added, a function traverses the relevant nodes and modifies the links. (The script ignores <script> elements for obvious reasons.)

This sample extension just encloses the link's text in ~~, but you can easily change it to do whatever you need.

manifest.json:

{
    "manifest_version": 2,
    "name":    "Test Extension",
    "version": "0.0",

    "content_scripts": [{
        "matches": [
            ...
            "*://www.google.gr/*",
            "*://www.google.com/*"
        ],
        "js":         ["content.js"],
        "run_at":     "document_end",
        "all_frames": false
    }]
}

content.js:

console.log("Injected...");

/* MutationObserver configuration data: Listen for "childList"
 * mutations in the specified element and its descendants */
var config = {
    childList: true,
    subtree: true
};
var regex = /<a.*?>[^<]*<\/a>/;

/* Traverse 'rootNode' and its descendants and modify '<a>' tags */
function modifyLinks(rootNode) {
    var nodes = [rootNode];
    while (nodes.length > 0) {
        var node = nodes.shift();
        if (node.tagName == "A") {
            /* Modify the '<a>' element */
            node.innerHTML = "~~" + node.innerHTML + "~~";
        } else {
            /* If the current node has children, queue them for further
             * processing, ignoring any '<script>' tags. */
            [].slice.call(node.children).forEach(function(childNode) {
                if (childNode.tagName != "SCRIPT") {
                    nodes.push(childNode);
                }
            });
        }
    }
}

/* Observer1: Looks for 'div#search' */
var observer1 = new MutationObserver(function(mutations) {
    /* For each MutationRecord in 'mutations'... */
    mutations.some(function(mutation) {
        /* ...if nodes have been added... */
        if (mutation.addedNodes && (mutation.addedNodes.length > 0)) {
            /* ...look for 'div#search' */
            var node = mutation.target.querySelector("div#search");
            if (node) {
                /* 'div#search' found; stop observer 1 and start observer 2 */
                observer1.disconnect();
                observer2.observe(node, config);

                if (regex.test(node.innerHTML)) {
                    /* Modify any '<a>' elements already in the current node */
                    modifyLinks(node);
                }
                return true;
            }
        }
    });
});

/* Observer2: Listens for '<a>' elements insertion */
var observer2 = new MutationObserver(function(mutations) {
    mutations.forEach(function(mutation) {
        if (mutation.addedNodes) {
            [].slice.call(mutation.addedNodes).forEach(function(node) {
                /* If 'node' or any of its descendants are '<a>'... */
                if (regex.test(node.outerHTML)) {
                    /* ...do something with them */
                    modifyLinks(node);
                }
            });
        }
    });
});

/* Start observing 'body' for 'div#search' */
observer1.observe(document.body, config);
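
To get closer to the question's goal (an image next to each link), the innerHTML line inside modifyLinks could be swapped for plain DOM insertion. Below is a minimal sketch of that idea, not a tested part of the extension above. The helper name addIconBefore, the marker attribute data-icon-added, and passing the document in as a parameter are all my own assumptions for illustration.

```javascript
// Made-up marker attribute, used to avoid decorating the same link twice
// (for example when an ancestor is re-inserted and traversed again).
var ICON_MARKER = "data-icon-added";

// Hypothetical helper: insert an <img> before 'link' using DOM methods
// instead of rewriting innerHTML. 'doc' is passed in so the function can be
// exercised outside a browser; in a content script pass 'document'.
function addIconBefore(link, iconUrl, doc) {
    // Skip links that are detached or were already processed.
    if (!link.parentNode || link.hasAttribute(ICON_MARKER)) {
        return false;
    }
    var img = doc.createElement("img");
    img.setAttribute("src", iconUrl);
    link.parentNode.insertBefore(img, link);
    link.setAttribute(ICON_MARKER, "1");
    return true;
}
```

In modifyLinks this would replace the line that wraps the text in ~~, e.g. addIconBefore(node, iconUrl, document), where iconUrl is whatever image you want to show. Inserting a node leaves the existing link element (and any handlers attached to it) in place, which rewriting innerHTML on a large container does not.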
gkalpak

I just wrote an extension that manipulates Google search results. The issue seems to be that the results are almost always fetched via AJAX. I used a MutationObserver to watch for changes in the results. There are two types of Google Search pages (that I have encountered so far): standard and Google Instant. For standard pages, you need to observe the body element (you can use the selector "#gsr"), but for Google Instant, you can just observe the containing DIV ("#search"). In both cases you will want to observe childList and subtree mutations.
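
A rough sketch of that root selection follows. It assumes the "#gsr" and "#search" selectors still match Google's markup, which changes over time. How to tell the two page types apart is not covered here, so the caller supplies it; pickObserverRoot, startObserving, and the pageType values are hypothetical names of mine.

```javascript
// MutationObserver options: child insertions/removals anywhere below the root.
var observerConfig = { childList: true, subtree: true };

// Hypothetical helper: pick the node to observe for the given page type,
// falling back to doc.body if the expected element is not found.
function pickObserverRoot(doc, pageType) {
    var selector = (pageType === "instant") ? "#search" : "#gsr";
    return doc.querySelector(selector) || doc.body;
}

// Hypothetical helper: start observing and pass every added node to 'onAdded'.
function startObserving(doc, pageType, onAdded) {
    var observer = new MutationObserver(function (mutations) {
        mutations.forEach(function (mutation) {
            Array.prototype.forEach.call(mutation.addedNodes, onAdded);
        });
    });
    observer.observe(pickObserverRoot(doc, pageType), observerConfig);
    return observer;
}

// Only run inside a page; a content script would call this once on injection.
if (typeof document !== "undefined" && typeof MutationObserver !== "undefined") {
    startObserving(document, "standard", function (node) {
        console.log("added:", node.nodeName);
    });
}
```

Observing the body-level element is the broader choice: it also sees changes inside "#search", at the cost of more callbacks to filter.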

(Edited per @ExpertSystem's comments)

Aaron J Spetner