
A webpage is setting a built-in JavaScript method to null, and I'm trying to find a way to call the original, overridden method from a userscript.

Consider the following code:

// Overriding the native method to something else
document.querySelectorAll = null;

Now, if I try to execute document.querySelectorAll('#an-example'), I get the exception Uncaught TypeError: null is not a function, because the method has been set to null and is no longer accessible.

I'm looking for a way to somehow restore the reference to the method in my userscript. The problem is that the website can override the reference to anything (even including the Document, Element and Object constructors).

Since the website can also easily set the reference to null, I need a way to access the querySelectorAll method that the website won't be able to override.

The challenge is that any method, such as createElement and getElementsByTagName (as well as the versions on their prototypes), may already have been overridden to null by the time my userscript is executed on the page.

My question is, how do I access the Document or HTMLDocument constructor methods, if they have also been overridden?


Note:

Because browser limitations prevent Tampermonkey from running my script at the very beginning of a document, I'm unable to save a reference to the method I'd like to use, with something like this:

// the following code cannot be run at the beginning of the document
var _originalQuerySelectorAll = document.querySelectorAll;
Community
  • For `querySelectorAll`, at least, it would be possible to code it yourself manually - essentially, a polyfill. (if, indeed, there are no references to the native function anywhere) – CertainPerformance Dec 30 '18 at 13:02
  • @CertainPerformance The issue is that the website owner will also carelessly override `getElementById` and/or `getElementsByTagName`, so I can't write a polyfill. –  Dec 30 '18 at 13:07
  • Those methods wouldn't help - I was thinking of recursively iterating over the `children` of each element. Or, of course, just [copy someone's](https://www.google.com/search?q=queryselectorall+polyfill) – CertainPerformance Dec 30 '18 at 13:08
  • Maybe you'll find your answer here [How to unset a JavaScript variable?](https://stackoverflow.com/questions/1596782/how-to-unset-a-javascript-variable) – Rafał Dec 30 '18 at 13:14
  • @Rafał I appreciate it, I didn't think unsetting a method from the `document` would restore it to the original inherited state! –  Dec 30 '18 at 13:24
  • @Unknown Do you mean one of the methods in that link worked? `delete document.querySelectorAll` would only work if the other script writer was lazy and didn't *actually* set the native `querySelectorAll` method to null. – CertainPerformance Dec 30 '18 at 22:37
  • @CertainPerformance, do you have an example of this *actually* setting the native `querySelectorAll` method to null? – Brock Adams Dec 31 '18 at 00:02
  • @BrockAdams The page writer needs to be *determined* to make the method inaccessible - `Document.prototype.querySelectorAll = null;` and `Element.prototype.querySelectorAll = null;` https://jsfiddle.net/gruLm7dw/ – CertainPerformance Dec 31 '18 at 00:28
  • If the prototype methods are overwritten before `document-start` and the situation doesn't fit the other techniques, I wonder if one could make a script that starts on some *other* page on the same domain, save a reference to the needed functions, then `fetch` the HTML from the desired URL and replace the current page with it. – CertainPerformance Dec 31 '18 at 01:00

2 Answers


There are at least 3 approaches:

  1. Use the userscript sandbox. Alas, this currently only works on Greasemonkey (including version 4+) due to Tampermonkey and Violentmonkey design flaws / bugs. More below.
  2. Use @run-at document-start. Except that this too will not work on fast pages.
  3. Delete the function override. This usually works, but is liable to more interference with/from the target page, and it can be blocked if the page alters the prototype of the function.


See, also, Stop execution of Javascript function (client side) or tweak it


Note that all of the script and extension examples, below, are complete working code.
And you can test them against this JS Bin page by changing:
      *://YOUR_SERVER.COM/YOUR_PATH/*
to:
      https://output.jsbin.com/kobegen*



Userscript Sandbox:

This is the preferred method and works on Firefox+Greasemonkey (including Greasemonkey 4).

When setting @grant to other than none, the script engine is supposed to run the script in a sandbox that browsers specifically provide for that purpose.

In the proper sandbox, the target page can override document.querySelectorAll or other native functions all it wants, and the userscript will see its own, completely untouched instances, regardless.

This should always work:

// ==UserScript==
// @name     _Unoverride built in functions
// @match    *://YOUR_SERVER.COM/YOUR_PATH/*
// @grant    GM_addStyle
// @grant    GM.getValue
// ==/UserScript==
//- The @grant directives are needed to restore the proper sandbox.

console.log ("document.querySelectorAll: ", document.querySelectorAll);

and yield:

document.querySelectorAll: function querySelectorAll() { [native code] }

However, neither Tampermonkey nor Violentmonkey sandboxes properly, in either Chrome or Firefox.
The target page can tamper with the native functions a Tampermonkey script sees, even with Tampermonkey's or Violentmonkey's version of the sandbox on.
This is not just a design flaw; it is a security flaw and a vector for potential exploits.

We know that Firefox and Chrome are not the culprits since (1) Greasemonkey-4 sets up the sandbox properly, and (2) a Chrome extension sets up the "Isolated World" properly. That is, this extension:

manifest.json:

{
    "manifest_version": 2,
    "content_scripts": [ {
        "js":               [ "Unoverride.js" ],
        "matches":          [ "*://YOUR_SERVER.COM/YOUR_PATH/*" ]
    } ],
    "description":  "Unbuggers native function",
    "name":         "Native function restore slash use",
    "version":      "1"
}

Unoverride.js:

console.log ("document.querySelectorAll: ", document.querySelectorAll);

Yields:

document.querySelectorAll: function querySelectorAll() { [native code] }

as it should.



Use @run-at document-start:

Theoretically, running the script at document-start should allow the script to catch the native function before it's altered.
EG:

// ==UserScript==
// @name     _Unoverride built in functions
// @match    *://YOUR_SERVER.COM/YOUR_PATH/*
// @grant    none
// @run-at   document-start
// ==/UserScript==

console.log ("document.querySelectorAll: ", document.querySelectorAll);

And this sometimes works on slow enough pages and/or networks.

But, as the OP already noted, neither Tampermonkey nor Violentmonkey actually injects and runs before any other page code, so this method fails on fast pages.

Note that a Chrome-extension content script set with "run_at": "document_start" in the manifest does run at the correct time and/or fast enough.



Delete the function override:

If the page (mildly) overrides a function like document.querySelectorAll, you can clear the override using delete, like so:

// ==UserScript==
// @name     _Unoverride built in functions
// @match    *://YOUR_SERVER.COM/YOUR_PATH/*
// @grant    none
// ==/UserScript==

delete document.querySelectorAll;

console.log ("document.querySelectorAll: ", document.querySelectorAll);

which yields:

document.querySelectorAll: function querySelectorAll() { [native code] }

The drawbacks are:

  1. Won't work if the page alters the prototype. EG:
    Document.prototype.querySelectorAll = null;
  2. The page can see or remake such changes, especially if your script fires too soon.
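
Why delete helps against the mild override but not against the prototype override can be seen in a DOM-free sketch. FakeDocument and fakeDoc are hypothetical stand-ins for Document and document, used only so that the snippet runs outside a browser; the property-lookup rules it relies on are the ordinary JavaScript ones.

```javascript
// Hypothetical stand-in for the browser's Document; not a real DOM class.
class FakeDocument {
  querySelectorAll(selector) {
    return `native result for ${selector}`;
  }
}
const fakeDoc = new FakeDocument();

// Mild override: the "page" puts an own property on the instance,
// which shadows the method inherited from the prototype.
fakeDoc.querySelectorAll = null;
const shadowed = Object.prototype.hasOwnProperty.call(fakeDoc, 'querySelectorAll');

// Deleting the own property exposes the inherited method again.
delete fakeDoc.querySelectorAll;
const restored = typeof fakeDoc.querySelectorAll;

// Determined override: the "page" replaces the method on the prototype itself.
FakeDocument.prototype.querySelectorAll = null;
delete fakeDoc.querySelectorAll; // no own property is left, so nothing changes
const afterPrototypeOverride = fakeDoc.querySelectorAll;

console.log({ shadowed, restored, afterPrototypeOverride });
```

In the second case there is nothing on the instance to delete, so the lookup still ends at the nulled prototype property.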

Mitigate item 2 by making a private copy:

// ==UserScript==
// @name     _Unoverride built in functions
// @match    *://YOUR_SERVER.COM/YOUR_PATH/*
// @grant    none
// ==/UserScript==

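// Remove the page's own-property override so the inherited native method shows through.
delete document.querySelectorAll;

// Keep a private reference and always call it with `document` as `this`.
var _goodfunc = document.querySelectorAll;
var goodfunc  = function (params) {return _goodfunc.call (document, params); };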

console.log (`goodfunc ("body"): `, goodfunc("body") );

which yields:

goodfunc ("body"): NodeList10: body, length: 1,...

And goodfunc() will continue to work (for your script) even if the page tampers with document.querySelectorAll again.
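
The private-copy idea can also be written as a small helper that forwards any number of arguments. This is only a sketch: makePrivateCopy is a hypothetical helper name, fakeDoc is a plain-object stand-in for document so that the snippet runs without a browser, and it assumes the page has not also tampered with Reflect.

```javascript
// Hypothetical helper: capture a method now and always call it on `target`.
function makePrivateCopy(target, name) {
  const original = target[name];
  if (typeof original !== 'function') {
    throw new TypeError(`${name} is not callable; it was overridden before capture`);
  }
  // Forward every argument and pin `this` to the original target.
  return (...args) => Reflect.apply(original, target, args);
}

// Plain-object stand-in for `document` (an assumption made for this demo).
const fakeDoc = {
  title: 'demo',
  querySelectorAll(selector) {
    return [`${this.title}:${selector}`];
  },
};

const goodfunc = makePrivateCopy(fakeDoc, 'querySelectorAll');

// The "page" overrides the method again after the copy was taken.
fakeDoc.querySelectorAll = null;

// The private copy is unaffected by the later override.
console.log(goodfunc('body'));
```

The helper throws if the method is already gone at capture time, which makes a too-late script fail loudly instead of storing null.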

Brock Adams
  • Thank you for your extensive write-up, I would upvote it multiple times if I could! –  Dec 31 '18 at 14:56
  • I have a question regarding the sandbox mechanism: If a userscript is _properly_ sandboxed from the main document, does that mean `document.querySelectorAll()` refers to _another_ document, or can I still call it on the loaded page to get its DOM Elements? –  Dec 31 '18 at 14:58
  • @Unknown, in a properly sandboxed environment, it would be the latter, your script sees the current document's DOM, unsullied by any changes the page makes to the functions. But your page does see the changes made to the HTML nodes. See [Google's introduction to the "Isolated World"](https://developer.chrome.com/extensions/content_scripts#isolated_world) and the [MDN documentation for "Xray vision"](https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Xray_vision). – Brock Adams Dec 31 '18 at 18:23
  • There's an instant injection mode in Tampermonkey that supposedly is guaranteed to run before any page scripts. It uses a deprecated sync XHR to deliver the script code into the content script while totally blocking the parsing of the page at `document_start`. – wOxxOm Dec 31 '18 at 20:39
  • @wOxxOm, I had tested the instant injection mode; it still didn't fire in time. Note that the bug report, that the OP linked, is still open and implies as much. For those reasons, I didn't reiterate it in this answer. – Brock Adams Dec 31 '18 at 20:54
  • That comment is irrelevant as it was written long before the instant injection mode was implemented. The instant injection mode, when enabled in the TM dashboard, doesn't use asynchronous communication at all: it sets a cookie on document via webRequest API, then the content script gets the data via a synchronous XHR in the content script right at document_start before any page scripts can run. – wOxxOm Jan 01 '19 at 12:22
  • @wOxxOm, regardless of your exegesis of that *open* bug report, it works on neither FF 64.0 nor Chrome 71 (nor others). Try it. (1) Set "Instant" Inject mode. (2) Optionally restart browser. (3) Copy the JS Bin code to a local file. (4) Add a `@match` for that page to the TM script and install the script. (5) Open the local page. (6) You will see that the script ran too late to catch the function override. – Brock Adams Jan 01 '19 at 19:29
  • That sounds like a bug in Chrome's serving of file:// scheme (it's not served via the standard path) to extension's webRequest API because I can't repro the problem when serving the same from a localhost that opens in about the same time (4ms vs 1ms from a file). I also doubt they'll fix this soon even if I report it since userscripting over local files is an edge case. – wOxxOm Jan 01 '19 at 20:06
  • @wOxxOm, I repro the bug on cached reloads from JS Bin (and also a local server). Running at `document-start` is not something one can currently rely on unless they are making their own Chrome extension (where it's `document_start`). – Brock Adams Jan 01 '19 at 20:19

An additional solution in Tampermonkey is to restore the original via an iframe, assuming the site's CSP allows it, which it usually does, AFAIK.

const builtin = new Proxy(document.createElement('iframe'), {
  get(frame, p) {
    if (!frame.parentNode) {
      frame.style.cssText = 'display:none !important';
      document.documentElement.appendChild(frame);
    }
    return frame.contentWindow[p];
  }
});

Usage:

console.log(builtin.document.querySelectorAll.call(document, '*'));
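
The trick relies on the iframe having its own set of built-ins that the parent page's overrides never touched; .call(document, ...) then applies the borrowed function to the main document. As a rough analogy that runs outside a browser, the sketch below uses Node's vm module to create a fresh context and borrow an untouched built-in from it. It illustrates the separate-set-of-built-ins idea only and is not the iframe code; String.prototype.padStart is just an arbitrary example method.

```javascript
const vm = require('vm');

// The "page" clobbers a built-in in the current context.
const saved = String.prototype.padStart; // kept only so the demo can clean up
String.prototype.padStart = null;

let padded;
try {
  // A new context has its own, untouched String.prototype,
  // much like an iframe's contentWindow has its own built-ins.
  const freshStringProto = vm.runInNewContext('String.prototype');
  // Borrow the pristine method and apply it to a string from this context.
  padded = freshStringProto.padStart.call('7', 3, '0');
} finally {
  String.prototype.padStart = saved; // undo the simulated override
}

console.log(padded);
```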

P.S. If the page isn't thorough, you can access the original via the prototype without iframe trick:

Document.prototype.querySelectorAll.call(document, '*')
Element.prototype.querySelectorAll.call(normalElements, '*')
wOxxOm
  • This is a pretty fantastic way to do it! But I thought about it, and the downside is that the site can _also override_ the **`createElement`** method, the same way it can override `querySelectorAll`. –  Jan 02 '19 at 21:50
  • On your PS note, I'm sure they'd even override the `prototype` methods just to mess with my userscript, the same way they did with qSA method. –  Jan 02 '19 at 21:54
  • Also, +1 on using `async` method, the code is written beautifully. –  Jan 02 '19 at 21:55
  • @user8700105 I found a WordPress JavaScript plugin that does just that with the `String.replaceAll` method – polendina Oct 03 '22 at 23:51