I have two wrappers:
    function wrapSentences(str, tmpl) {
        return str.replace(/[^\.!\?]+[\.!\?]+/g, tmpl || "<sentence>$&</sentence>");
    }
and
    function wrapWords(str, tmpl) {
        return str.replace(/\w+/g, tmpl || "<word>$&</word>");
    }
I use these in our extension to wrap every word and sentence on any webpage the user visits for TTS and settings purposes.
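For illustration, this is roughly what the two wrappers return on a plain string. The sample string and the `[$&]` template are made up for this example; only the two functions come from our code.

```javascript
// The two wrappers from above, repeated so the snippet runs on its own.
function wrapSentences(str, tmpl) {
    return str.replace(/[^\.!\?]+[\.!\?]+/g, tmpl || "<sentence>$&</sentence>");
}

function wrapWords(str, tmpl) {
    return str.replace(/\w+/g, tmpl || "<word>$&</word>");
}

// Made-up sample text.
const sample = "Hi there. How are you?";

const sentences = wrapSentences(sample);
console.log(sentences);
// <sentence>Hi there.</sentence><sentence> How are you?</sentence>

const words = wrapWords(sample);
console.log(words);
// <word>Hi</word> <word>there</word>. <word>How</word> <word>are</word> <word>you</word>?

// A custom template can be passed as the second argument.
const bracketed = wrapWords("a b", "[$&]");
console.log(bracketed);
// [a] [b]
```

Note that the space after a sentence-ending character ends up inside the next `<sentence>` wrapper, and text without a closing `.`, `!` or `?` is not wrapped at all.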
document.body is the only element I can count on being present on every website, but doing

    body.innerHTML = wrapWords(body.innerText)

will (obviously) replace every element that sat between the different text nodes, thus breaking (the visual part of) the website. I'm looking for a way to find the closest element around each piece of text without knowing anything specific about that element, so I can replace its text with a wrapped equivalent without otherwise altering the website.
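To make the breakage concrete, here is a small sketch that needs no DOM. The HTML fragment, and the text I assume innerText would return for it, are both made up for this example.

```javascript
// Same wrapper as above, repeated so the snippet runs on its own.
function wrapWords(str, tmpl) {
    return str.replace(/\w+/g, tmpl || "<word>$&</word>");
}

// Hypothetical page fragment with a link inside a paragraph.
const originalHtml = '<p>Hello <a href="/x">world</a>!</p>';

// Assumption: roughly what innerText would give for the fragment above.
const visibleText = "Hello world!";

// This string is what would be assigned back to innerHTML.
const rewritten = wrapWords(visibleText);
console.log(rewritten);
// <word>Hello</word> <word>world</word>!

// The original markup is gone from the result.
console.log(originalHtml.includes("<a "), rewritten.includes("<a "));
// true false
```

The words are wrapped, but the `<p>` and `<a>` elements (and the link target) no longer exist, which is exactly the problem I need to avoid.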
I found several examples that walk down to the deepest child, but they all rely on being passed something (a node or an id) that the extension has no way of knowing about. We will use rangy for highlighting, but that has the same issue: I always end up having to pass a node or id that the extension cannot know about when visiting random sites.
One of the examples that needs a node passed:
    function replaceTextNodes(node, newText) {
        if (node.nodeType === 3) {
            // Filter out text nodes that contain only whitespace
            if (!/^\s*$/.test(node.data)) {
                node.data = newText;
            }
        } else if (node.hasChildNodes()) {
            for (let i = 0, len = node.childNodes.length; i < len; ++i) {
                replaceTextNodes(node.childNodes[i], newText);
            }
        }
    }
I'll be happy to explain it better if needed. I'm aware that my wording may not always be the best.