How can I replace the text on a webpage, including text that is injected or modified with future JavaScript calls? All the answers in replace words in the body text only work on text that is on the page at the moment of execution.
1 Answer
It turns out that doing the above in a performant way and without breaking anything is nontrivial for the supposedly declarative markup language that is HTML. I have documented what I learned over a month of testing and experimenting below.
To do an initial round of replacement on existing text, we can use a TreeWalker to visit every Text node in the document and process its contents. In this example, I will be censoring "heck" as "h*ck".
    const callback = text => text.replaceAll(/heck/gi, 'h*ck');

    function processNodes(root) {
        const nodes = document.createTreeWalker(
            root, NodeFilter.SHOW_TEXT, { acceptNode:
                node => valid(node) ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_REJECT
        });
        while (nodes.nextNode()) {
            nodes.currentNode.nodeValue = callback(nodes.currentNode.nodeValue);
        }
    }

    function valid(node) {
        return (
            node.parentNode !== null
            && node.parentNode.tagName !== 'SCRIPT'
            && node.parentNode.tagName !== 'STYLE'
            && !node.parentNode.isContentEditable
        );
    }

    processNodes(document.body);
Note the valid function. It handles three exceptional cases:
- We need to check that the parent node exists, because sometimes the node will have been removed from the document by the time we get around to it
- Messing with <script> and <style> tags could break functionality or presentation
- Editing a contenteditable element resets the cursor position, which is a terrible user experience
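Separately from the traversal, the callback itself can be tightened. As written, /heck/gi also matches inside longer words ("checker" becomes "ch*cker") and always emits lowercase "h*ck". The following is only a sketch of one possible refinement, not part of the original answer; the name censor is made up for illustration. It matches whole words only and keeps the capitalisation of the letters that remain.

```javascript
// Hypothetical alternative to `callback`: whole-word, case-preserving censor.
// \b anchors the match to word boundaries, so "checker" is left untouched.
const censor = text =>
  text.replace(/\bheck\b/gi, match => `${match[0]}*${match.slice(2)}`);

console.log(censor('Heck, the checker said HECK.'));
// prints "H*ck, the checker said H*CK."
```

Because it is a plain string-to-string function, it can be passed wherever callback is used above.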
But that only takes care of text that was already on the page. To watch for future changes, we can use a MutationObserver to watch for added or modified text nodes.
    const IGNORED = [
        Node.CDATA_SECTION_NODE,
        Node.PROCESSING_INSTRUCTION_NODE,
        Node.COMMENT_NODE,
    ];
    const CONFIG = {subtree: true, childList: true, characterData: true};

    const observer = new MutationObserver((mutations, observer) => {
        observer.disconnect();
        for (const mutation of mutations) {
            const target = mutation.target;
            switch (mutation.type) {
                case 'childList':
                    for (const node of mutation.addedNodes) {
                        if (node.nodeType === Node.TEXT_NODE) {
                            if (valid(node)) {
                                node.nodeValue = callback(node.nodeValue);
                            }
                        } else if (!IGNORED.includes(node.nodeType)) {
                            processNodes(node);
                        }
                    }
                    break;
                case 'characterData':
                    if (!IGNORED.includes(target.nodeType) && valid(target)) {
                        target.nodeValue = callback(target.nodeValue);
                    }
                    break;
            }
        }
        observer.observe(document.body, CONFIG);
    });

    observer.observe(document.body, CONFIG);
The observer's callback consists of two main parts: a case for childList that processes any new subtrees and text nodes, and a case for characterData that handles text nodes whose contents changed. We must disconnect the observer before making any edits of our own to avoid triggering an infinite loop, and reconnect it afterwards. Also note the IGNORED array; this is necessary because certain nodes (comments, processing instructions, and CDATA sections) share the CharacterData interface with Text nodes but are not front-facing, user-visible content.
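One weakness of the disconnect/reconnect pattern is that if the replacement callback throws, the observer is never reconnected and later changes go unnoticed. A minimal sketch of a guard, assuming a helper name (withObserverPaused) that does not appear in the original answer, wraps the edit in try/finally:

```javascript
// Hypothetical helper: pause an observer while `edit` runs, then resume it.
// `observer` only needs disconnect() and observe(), as MutationObserver has.
function withObserverPaused(observer, target, config, edit) {
  observer.disconnect();
  try {
    return edit();
  } finally {
    // Runs even if `edit` throws, so the observer is always re-attached.
    observer.observe(target, config);
  }
}

// In the browser, the body of the observer callback could then be written as:
//   withObserverPaused(observer, document.body, CONFIG, () => {
//     for (const mutation of mutations) { /* same switch as above */ }
//   });
```

The helper takes the observer as a parameter rather than closing over it, which also makes it easy to exercise with a stand-in object outside a browser.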
Putting those two pieces together should be enough 98% of the time. However, there are still many special cases we didn't consider:
- Certain HTML attributes that get rendered as text (placeholder in <input>, alt in <img>, value in <input type="button">)
- CSS content property
- Shadow DOMs
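To give a rough idea of the first case only, here is a sketch that applies a replacement function to text-like attributes of a single element. It is an illustration under stated assumptions, not how any particular library does it: the names TEXT_ATTRIBUTES and processAttributes are invented here, only placeholder and alt are covered (value would additionally need a check on the input's type), and catching later attribute changes would also require adding attributes: true to the observer configuration.

```javascript
// Hypothetical list of attributes whose values are rendered as visible text.
const TEXT_ATTRIBUTES = ['placeholder', 'alt'];

// Apply `replace` (a string => string function) to each listed attribute.
// `element` needs hasAttribute/getAttribute/setAttribute, as DOM Elements have.
function processAttributes(element, replace) {
  for (const name of TEXT_ATTRIBUTES) {
    if (!element.hasAttribute(name)) continue;
    const oldValue = element.getAttribute(name);
    const newValue = replace(oldValue);
    // Write only when something changed, to avoid needless DOM mutations.
    if (newValue !== oldValue) element.setAttribute(name, newValue);
  }
}

// Browser usage, reusing the `callback` defined earlier:
//   document.querySelectorAll('[placeholder], [alt]')
//     .forEach(element => processAttributes(element, callback));
```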
A proper explanation of workarounds for the above wouldn't fit in a StackOverflow answer, but I have written a free library called TextObserver that does it for you.
