Below is a minimal example of some HTML from which I am trying to extract the text content. My desired outcome is the array ['keep1', 'keep2', 'keep3', 'keep4', 'keep5'], so I am dropping anything that is a child element of the div, then splitting the div's text into an array on the <br /> tags.
Usually I would use .innerText on the div, which helpfully returns all the text without the markup, but as far as I am aware it is not suitable in this case because I then lose the <br /> tags that I need for splitting into an array. Below is the best I could come up with, but it doesn't handle cases where child elements are not surrounded by <br />. Is there a better way to do this?
const text = document
  .querySelector("div")
  .innerHTML.split("<br>")
  .map(e => e.trim())
  .filter(e => e[0] !== "<" && e !== "");
console.log(text);
<div>
<br /> keep1 <br /> keep2
<span>drop</span> keep3
<br /> keep4
<br />
<h4>drop2</h4>
<br />keep5
</div>
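
For comparison, here is a minimal sketch of one alternative, assuming that every run of text sitting directly inside the div should become its own array entry (which matches the desired output above, where keep2 and keep3 are separated only by a span). Instead of splitting innerHTML, it walks the div's childNodes and keeps only the text nodes. The function name directTextParts and the TEXT_NODE constant are made up for illustration; childNodes, nodeType and textContent are standard DOM properties.

```javascript
// Hypothetical helper, not part of any library.
// 3 is the value of Node.TEXT_NODE in browsers; it is spelled out here
// so the function does not depend on the global Node object.
const TEXT_NODE = 3;

function directTextParts(container) {
  return Array.from(container.childNodes)
    // Keep only text that is a direct child of the container;
    // <br>, <span>, <h4> and their contents are element nodes and are skipped.
    .filter(node => node.nodeType === TEXT_NODE)
    .map(node => node.textContent.trim())
    // Whitespace-only text nodes (e.g. the newline between tags) become "".
    .filter(text => text !== "");
}

// Browser usage, guarded so the snippet also loads outside a page.
if (typeof document !== "undefined") {
  const div = document.querySelector("div");
  if (div) {
    console.log(directTextParts(div));
  }
}
```

Because text nested inside child elements never shows up as a direct text node, drop and drop2 are skipped without any string matching. Under the same assumption, two entries that share a single text node (with no element between them) would come back as one string, so this is not a drop-in replacement for splitting on <br />.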