This is a slightly open-ended question, but I am hoping someone can help. Here is what I am trying to do.
I am writing a browser extension that parses all the HTML of the page, calls an API, waits for a response and then replaces all the text on the page with the text from the API response without changing the format of the page.
I have tried several different approaches to parsing the HTML of the page:
Grab document.body.getElementsByTagName("*") and then process a few of those elements at a time. The disadvantage is that the list contains every element on the page, parents as well as their children, so a child's text gets sent to the API twice: once on its own and once as part of its parent.
Grab document.body.childNodes and then work my way down each child node. This isn't working because some child nodes are too large and exceed what the API will accept in a single request.
So my open-ended question is this: if you had to traverse the DOM of any page on the web and replace all of its text with other text, what is the best approach to take?
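For reference, here is a minimal sketch of one possible direction, not a tested solution: walk only the text nodes with document.createTreeWalker, so parents and children are never both processed, and group those nodes into batches that stay under a size limit. MAX_CHARS and the function names collectTextNodes, batchBySize and applyReplacements are made up for illustration, and the real size limit of the API is an assumption.

```javascript
// Hypothetical limit; the API's real maximum request size is not known here.
const MAX_CHARS = 1000;

// Collect only the text nodes under `root`. Each piece of text appears in
// exactly one text node, so nothing is picked up twice.
function collectTextNodes(root) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const nodes = [];
  let node;
  while ((node = walker.nextNode())) {
    const parentName = node.parentNode ? node.parentNode.nodeName : "";
    // Skip script/style contents and whitespace-only nodes.
    if (parentName === "SCRIPT" || parentName === "STYLE") continue;
    if (node.nodeValue.trim() === "") continue;
    nodes.push(node);
  }
  return nodes;
}

// Group nodes into batches whose combined text length stays within maxChars.
// A single node longer than maxChars still gets a batch of its own.
function batchBySize(nodes, maxChars) {
  const batches = [];
  let current = [];
  let size = 0;
  for (const node of nodes) {
    const len = node.nodeValue.length;
    if (current.length > 0 && size + len > maxChars) {
      batches.push(current);
      current = [];
      size = 0;
    }
    current.push(node);
    size += len;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Write replacement strings back into the same text nodes, leaving the
// surrounding elements untouched.
function applyReplacements(batch, replacements) {
  batch.forEach((node, i) => {
    node.nodeValue = replacements[i];
  });
}
```

Each batch from batchBySize(collectTextNodes(document.body), MAX_CHARS) would then go out as its own request, with the response written back through applyReplacements. This assumes the API returns one replacement string per input string, in the same order.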