Let's say I build an HTML fragment using the following code:

$dom = new DOMDocument();
$header = $dom->createElement("h2", "Lorem & Ipsum");
$dom->appendChild($header);
print($dom->saveHTML());

The printed HTML contains the unescaped & symbol instead of the required &amp; entity. The code also triggers the following PHP warning:

Warning: DOMDocument::createElement(): unterminated entity reference

What's the best way to handle this?

vqdave
  • Dupe? https://stackoverflow.com/questions/28350112/php-domdocument-what-is-the-nicest-way-to-safely-add-text-to-an-element & https://stackoverflow.com/questions/22956330/cakephp-xml-utility-library-triggers-domdocument-warning/22957785#22957785 – mickmackusa Feb 25 '18 at 23:12

1 Answer

It appears that the PHP team is not willing to change this behavior (source), so we have to find a workaround instead.

One way is to write the entity yourself in the PHP source, like so:

$header = $dom->createElement("h2", "Lorem &amp; Ipsum");

However, this isn't always convenient, since the text may be stored in a variable or contain special characters other than &. In that case, you can use the htmlentities function.

$text = "Lorem & Ipsum";
$header = $dom->createElement("h2", htmlentities($text));
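To illustrate the variable case with more than one special character, here is a minimal sketch. It swaps in htmlspecialchars, which escapes only the markup-significant characters, in place of htmlentities; the sample string is made up for the example and is not from the question.

```php
<?php
// Hypothetical input with two characters that are special in markup: & and <
$text = "Tom & Jerry <3";

$dom = new DOMDocument();
// htmlspecialchars() turns & into &amp; and < into &lt; before the value
// reaches createElement(), which decodes them back into literal text.
$header = $dom->createElement("h2", htmlspecialchars($text));
$dom->appendChild($header);

// saveHTML() re-escapes the text on output; trim() drops the trailing newline.
$html = trim($dom->saveHTML());
print($html . "\n"); // prints "<h2>Tom &amp; Jerry &lt;3</h2>"
```

The element's textContent ends up equal to the original string, so the escaping is only a transport detail for createElement.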

If this still is not an ideal solution, another workaround is to use the textContent property instead of the second argument in createElement.

In the code below, I've implemented this in a DOMDocument subclass, so you only need to instantiate BetterDOM in place of DOMDocument to work around this strange behavior.

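class BetterDOM extends DOMDocument {
    // Same call shape as DOMDocument::createElement(), but the text is
    // assigned via textContent, which treats it as literal text.
    public function createElement($tag, $text = null) {
        // Create the element without a value so nothing is parsed for entities
        $base = parent::createElement($tag);
        if ($text !== null) {
            $base->textContent = $text;
        }
        return $base;
    }
}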

// Correctly prints "<h2>Lorem &amp; Ipsum</h2>" with no errors
$dom = new BetterDOM();
$header = $dom->createElement("h2", "Lorem & Ipsum");
$dom->appendChild($header);
print($dom->saveHTML());
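If a subclass is more than you need, a commonly used alternative (not part of the approach above) is to create the element without a value and append a text node to it. This is only a sketch using the standard createTextNode method, which treats its argument as literal text:

```php
<?php
$dom = new DOMDocument();

// Create the element with no value, so nothing is parsed for entities
$header = $dom->createElement("h2");
// createTextNode() takes the string as-is; the & is escaped only on output
$header->appendChild($dom->createTextNode("Lorem & Ipsum"));
$dom->appendChild($header);

$html = $dom->saveHTML();
print($html); // prints "<h2>Lorem &amp; Ipsum</h2>"
```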
vqdave