It's a mistake to do a body.innerHTML replacement under almost all circumstances. It may appear to work on some trivial examples, but for any non-trivial application, it will fail. See "don't parse HTML with regex".
To get a taste of why this approach is a non-starter, consider if your document has the following element:
<div style="color: #AA144"></div>
How is your regex going to know not to slap a bunch of spans inside this attribute string for every A or A1? It won't.
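To make the failure concrete, here is a small sketch that runs the same kind of replacement over the element's markup as a plain string. The variable names are illustrative only; the pattern and replacement are the ones used later in this answer.

```javascript
// The element from above, as it would appear inside body.innerHTML.
const html = '<div style="color: #AA144">foobazz A1</div>';

const pattern = /(A1|A)/g;
const replacement = "<span style='color:#FF0000;'>$1</span>";

// A string-level replace cannot tell attribute values from text content,
// so the matches inside the style attribute get wrapped in spans too.
const naive = html.replace(pattern, replacement);

console.log(naive);
// The style attribute now starts with `color: #<span`, which is broken markup.
```

The text "foobazz A1" is highlighted as intended, but the color value in the attribute is corrupted along the way.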
Instead, use the document object model, which is a tree structure representing the markup (all the HTML parsing was done for you!). Traverse the nodes in the tree and operate on each node's textContent, performing the replacements:
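// Visit every element under <body>. querySelectorAll returns a static list,
// so the spans inserted below are not revisited.
for (const parent of document.querySelectorAll("body *")) {
  for (const child of parent.childNodes) {
    // Only touch text nodes; attributes and tag names are never seen here.
    if (child.nodeType === Node.TEXT_NODE) {
      const pattern = /(A1|A)/g;
      const replacement = "<span style='color:#FF0000;'>$1</span>";
      // Build a wrapper span holding the highlighted text, then swap it in
      // for the original text node.
      const subNode = document.createElement("span");
      subNode.innerHTML = child.textContent.replace(pattern, replacement);
      parent.insertBefore(subNode, child);
      parent.removeChild(child);
    }
  }
}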
body {
  background: white;
}

<div>
  foobar
  <div>
    <div style="color: #AA144">
      foobazz A1
    </div>
    foo quuz AA
    <h4>coraaAge</h4>
    <p>
      A bA bAA ello world A1
    </p>
  </div>
</div>
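One caveat with the snippet above: it feeds textContent back through innerHTML, so text containing characters such as < gets re-parsed as markup. A possible way around that, sketched here under the assumption that building nodes directly is acceptable, is to split the text into matched and unmatched segments and create a node for each. The helper names splitByPattern and toNodes are hypothetical, not part of any library.

```javascript
// Hypothetical helper: split text into plain and matched segments.
// The pattern must have the "g" flag, as String.prototype.matchAll requires.
function splitByPattern(text, pattern) {
  const segments = [];
  let last = 0;
  for (const m of text.matchAll(pattern)) {
    if (m.index > last) {
      segments.push({ text: text.slice(last, m.index), match: false });
    }
    segments.push({ text: m[0], match: true });
    last = m.index + m[0].length;
  }
  if (last < text.length) {
    segments.push({ text: text.slice(last), match: false });
  }
  return segments;
}

// Hypothetical helper: turn segments into DOM nodes without parsing any HTML.
// `doc` is the document object (passed in so the function is easy to test).
function toNodes(segments, doc) {
  return segments.map(({ text, match }) => {
    if (!match) return doc.createTextNode(text);
    const span = doc.createElement("span");
    span.style.color = "#FF0000";
    span.textContent = text; // assigned as text, never parsed as markup
    return span;
  });
}
```

In a browser, something like child.replaceWith(...toNodes(splitByPattern(child.textContent, /(A1|A)/g), document)) could then stand in for the innerHTML assignment inside the loop.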
If you want to do different colors for different patterns, you can use:
// Map each literal search string to its highlight color.
const swaps = {
  foo: "#f00",
  bar: "#b42"
};
// Build one alternation from the keys. They are used as-is, so they must not
// contain regex metacharacters.
const pattern = RegExp(Object.keys(swaps).join("|"), "g");
// Replacer callback: look up the color for whichever key matched.
const sub = m => `<span style='color:${swaps[m]};'>${m}</span>`;
for (const parent of document.querySelectorAll("body *")) {
  for (const child of parent.childNodes) {
    if (child.nodeType === Node.TEXT_NODE) {
      const subNode = document.createElement("span");
      subNode.innerHTML = child.textContent.replace(pattern, sub);
      parent.insertBefore(subNode, child);
      parent.removeChild(child);
    }
  }
}
body {
  background: white;
}

<div>
  foobazbarfoooo
  <div>
    <div style="color: #AA144">
      foobazz ybarybar
    </div>
    foo quuz bar
    <h4>corge foo</h4>
    <p>
      foo bar baz
    </p>
  </div>
</div>
Note that the order of the alternatives in the above regexes matters. (A|A1) will not work because the A option will be matched first, so a 1 character after it will fail to highlight.
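The ordering issue is easy to see on a plain string. The sketch below also shows one possible way to sidestep it when the pattern is built from a list of keys: sort longer keys first and escape regex metacharacters. The helpers escapeRegExp and buildPattern are hypothetical names introduced for illustration.

```javascript
const text = "A1 A";

// With A first, the engine matches A and never tries A1 at that position.
console.log(text.replace(/(A|A1)/g, "[$1]")); // [A]1 [A]
// With A1 first, the longer alternative wins where it applies.
console.log(text.replace(/(A1|A)/g, "[$1]")); // [A1] [A]

// Hypothetical helper: escape characters that are special in a regex.
const escapeRegExp = s => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical helper: build an alternation with longer keys first,
// so a key is never shadowed by one of its own prefixes.
const buildPattern = keys =>
  new RegExp(
    [...keys]
      .sort((a, b) => b.length - a.length)
      .map(escapeRegExp)
      .join("|"),
    "g"
  );

console.log(buildPattern(["A", "A1"]).source); // A1|A
```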
Also, it should go without saying that performing string replacement on a tree of nodes like this is an antipattern, even if it's much more workable than a giant .innerHTML call. This approach is still prone to serious performance and accuracy issues. In most cases, it's best to represent the data from the start using an in-memory, application-specific data structure, then format the data and generate HTML from it, which avoids expensive and brittle re-parsing of HTML.
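As a rough illustration of that last point, here is a minimal sketch that keeps the text as a list of segments and only generates HTML at the end. The segment shape ({ text, color }) and the helpers escapeHtml and render are assumptions made for this example, not something prescribed by the DOM or by any library.

```javascript
// Hypothetical helper: escape text so it is safe to embed in HTML.
const escapeHtml = s =>
  s.replace(/[&<>"']/g, c => ({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;"
  }[c]));

// Application-specific data: each segment knows whether it is highlighted.
const segments = [
  { text: "foo", color: "#f00" },
  { text: " quuz " },
  { text: "bar", color: "#b42" }
];

// Format the data into HTML in a single pass; nothing is ever re-parsed.
const render = segs =>
  segs
    .map(({ text, color }) =>
      color
        ? `<span style="color:${color};">${escapeHtml(text)}</span>`
        : escapeHtml(text)
    )
    .join("");

console.log(render(segments));
// <span style="color:#f00;">foo</span> quuz <span style="color:#b42;">bar</span>
```

Because the highlighting decision lives in the data, changing colors or patterns means re-rendering from the segments rather than searching through existing markup.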
I adapted code from another answer of mine for this. The question is not really a dupe, though.