
I have two wrappers:

function wrapSentences(str, tmpl) {
    return str.replace(/[^\.!\?]+[\.!\?]+/g, tmpl || "<sentence>$&</sentence>");
}

and

function wrapWords(str, tmpl) {
    return str.replace(/\w+/g, tmpl || "<word>$&</word>");
}

I use these in our extension to wrap every word and sentence on any webpage the user visits for TTS and settings purposes.

document.body is the one element that holds the visible content on every website, but doing body.innerHTML = wrapWords(body.innerText) will (obviously) discard every element that sat between the different text nodes, thus breaking (the visual part of) the website. I'm looking for a way to find the closest element around each piece of text without knowing anything specific about that element, so I can replace its text with a wrapped equivalent without otherwise altering the website.
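
To illustrate the problem, here is a minimal sketch. The HTML fragment is a made-up example, and innerText is simulated with a plain string so the snippet also runs outside a browser:

```javascript
// Same wrapper as defined above.
function wrapWords(str, tmpl) {
    return str.replace(/\w+/g, tmpl || "<word>$&</word>");
}

// Hypothetical page fragment, and the text that innerText would return for it.
const originalHtml = '<p>Hello <a href="#">world</a></p>';
const innerText = "Hello world";

// This is what body.innerHTML would be set to: the <p> and <a> elements are gone.
const replacedHtml = wrapWords(innerText);
console.log(replacedHtml); // <word>Hello</word> <word>world</word>
```

The words are wrapped, but the link and the paragraph no longer exist, which is exactly the breakage I want to avoid.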

I found several examples that go to the deepest child, but they all rely on passing something (node or id) the extension has no way of knowing about. We will use rangy for highlighting, but have the same issue... I always end up having to pass a node or id that the extension is unable to be aware of when visiting random sites.

One of the examples that needs a node passed:

function replaceTextNodes(node, newText) {
    if (node.nodeType === 3) {
        // Filter out text nodes that contain only whitespace
        if (!/^\s*$/.test(node.data)) {
            node.data = newText;
        }
    } else if (node.hasChildNodes()) {
        for (let i = 0, len = node.childNodes.length; i < len; ++i) {
            replaceTextNodes(node.childNodes[i], newText);
        }
    }
}

I'll be happy to explain it better if needed. I fear my wording may not always be the best, I'm aware of that.

Philip Raath
Silvio Langereis
  • So what is the issue if you use a recursive function like you presented at the end and pass it `document.body`? – trincot Oct 10 '17 at 14:54
  • I have nothing to pass as newText. I need to know the contents of every textnode separately before I can wrap it and pass it as newText. Passing body.innerText would result in every textnode containing the whole body's text. – Silvio Langereis Oct 10 '17 at 15:08
  • Of course, but I assumed you only provided that function as a model. So you are asking us to write a similar function, but that performs the wrapping? – trincot Oct 10 '17 at 15:13
  • I simply need a function that returns every deepest node containing text so that I can wrap it and return the wrapped text to that same node. – Silvio Langereis Oct 10 '17 at 15:19
  • It sounds easier in my head than I'm able to write out, sorry. – Silvio Langereis Oct 10 '17 at 15:20
  • Additional related: [Change matching words in a webpage's text to buttons](https://stackoverflow.com/q/40572679), [Replace text with link with chrome extension](https://stackoverflow.com/q/40276158) – Makyen Oct 11 '17 at 03:16
  • Thanks Makyen. It's weird that I hadn't found your answers on SO, though. Anyhow, it did the trick! – Silvio Langereis Oct 11 '17 at 08:20

1 Answer


It looks like what you want is all the text nodes on the page... This question might have your answer.
Using the function from the first answer:

Edit: now wrapping text in <word> nodes, not just their textContent

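// Collect every text node below el, in document order.
function textNodesUnder(el) {
  const nodes = [];
  const walk = document.createTreeWalker(el, NodeFilter.SHOW_TEXT);
  let n;
  while ((n = walk.nextNode())) nodes.push(n);
  return nodes;
}

// Group 1 matches a run of non-word characters, group 2 a single word.
const exp = /(?:(\W+)|(\w+))/g;

textNodesUnder(document.body)
    // Skip text nodes that contain only whitespace
    .filter(t => !/^\s*$/.test(t.textContent))
    .forEach(t => {
        const s = t.textContent;
        let match;
        while ((match = exp.exec(s)) !== null) {
            let el;
            if (match[1] !== undefined) {
                // Punctuation and whitespace stay plain text
                el = document.createTextNode(match[1]);
            } else {
                // Each word gets its own <word> element
                el = document.createElement("word");
                el.textContent = match[2];
            }
            t.parentNode.insertBefore(el, t);
        }
        // The original text node is now fully duplicated, so remove it
        t.parentNode.removeChild(t);
    });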
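
Building on the loop above, here is a hedged sketch that separates the string splitting from the DOM work, so the splitting can be checked on its own. The names `splitWords`, `wrapTextNodes` and `SKIPPED_TAGS` are mine, and skipping the contents of script, style, noscript and textarea elements is my assumption about what an extension running on arbitrary pages would want, not something stated in the question:

```javascript
// Split a string into alternating word / non-word segments, in order.
function splitWords(text) {
  const exp = /(\W+)|(\w+)/g;
  const segments = [];
  let match;
  while ((match = exp.exec(text)) !== null) {
    segments.push(match[1] !== undefined
      ? { type: "text", value: match[1] }
      : { type: "word", value: match[2] });
  }
  return segments;
}

// Assumption: text inside these elements should never be wrapped.
const SKIPPED_TAGS = new Set(["SCRIPT", "STYLE", "NOSCRIPT", "TEXTAREA"]);

function wrapTextNodes(root) {
  // Collect the text nodes first, because the tree is modified afterwards.
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const nodes = [];
  let node;
  while ((node = walker.nextNode())) nodes.push(node);

  for (const t of nodes) {
    const parent = t.parentNode;
    if (!parent || /^\s*$/.test(t.data)) continue;
    if (SKIPPED_TAGS.has(parent.nodeName.toUpperCase())) continue;

    // Build the replacement off-document, then swap it in with one operation.
    const fragment = document.createDocumentFragment();
    for (const seg of splitWords(t.data)) {
      if (seg.type === "word") {
        const el = document.createElement("word");
        el.textContent = seg.value;
        fragment.appendChild(el);
      } else {
        fragment.appendChild(document.createTextNode(seg.value));
      }
    }
    parent.replaceChild(fragment, t);
  }
}

// Only touch the DOM when one exists (e.g. in a content script).
if (typeof document !== "undefined") {
  wrapTextNodes(document.body);
}
```

Because only text nodes are replaced, the elements around them (links, paragraphs, spans) stay where they are. Wrapping sentences is harder with this approach, since a single sentence can span several text nodes.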
Jean-Alphonse