We use a CMS on our site. Many users have added HTML content to the database that is formatted oddly, for example by putting all their HTML on a single line:
<h1>This is my title</h1><p>First paragraph</p><p>Second paragraph</p>
This renders in the browser correctly, of course. However, I am writing a script in PHP that loads up this data into a DOMDocument like so:
$doc = new DOMDocument();
$doc->loadHTML($row['body_html']);
var_dump($doc->documentElement->textContent);
This shows up as:
This is my titleFirst paragraphSecond paragraph
How can I get documentElement to return innerText, rather than textContent? I believe innerText will return a string with line breaks.
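For reference, here is a minimal sketch of the kind of result I am after. As far as I can tell, PHP's DOMNode only exposes textContent and has no innerText property, so this walks the tree by hand and appends a newline after each block-level element. The function name blockAwareText and the $blockTags list are my own assumptions, not part of the DOM extension, and this only approximates what a browser's innerText does (it ignores CSS, and it does not skip script or style contents).

```php
<?php
// Hypothetical helper (not part of the DOM extension): collects text like
// textContent, but ends each block-level element's text with a newline.
function blockAwareText(DOMNode $node): string
{
    // Assumed set of block-level tags; adjust for the markup in your CMS.
    static $blockTags = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6',
                         'p', 'div', 'li', 'tr', 'blockquote', 'pre'];

    // Text (and CDATA) nodes contribute their content directly.
    if ($node instanceof DOMText) {
        return $node->nodeValue;
    }

    // Skip comments, processing instructions, and other non-element nodes.
    if (!($node instanceof DOMElement)) {
        return '';
    }

    // Recurse into children and concatenate their text.
    $text = '';
    foreach ($node->childNodes as $child) {
        $text .= blockAwareText($child);
    }

    // Terminate block-level elements with exactly one newline.
    if (in_array(strtolower($node->nodeName), $blockTags, true)) {
        $text = rtrim($text) . "\n";
    }

    return $text;
}

$doc = new DOMDocument();
$doc->loadHTML('<h1>This is my title</h1><p>First paragraph</p><p>Second paragraph</p>');

// Prints the three pieces of text on separate lines.
echo trim(blockAwareText($doc->documentElement));
```

With the sample row above, this puts the title and each paragraph on its own line instead of running them together.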