
Here's the issue: I have a web page that saves HTML fragments on the server side. In PHP, when I load a fragment into DOMDocument, add a custom attribute to an element, and save the HTML to a file, the output includes html, body, and other wrapper elements that I don't need. The fragment is sent back to the browser to be inserted into the DOM, where those wrappers would be invalid (you cannot have nested HTML/BODY). Here's a quick example of my code:

$html="<div><magic></magic>
 <video controls></video>
    <a href='http://example.com'>Example</a><br>
    <a href='http://google.com'>Google</a><br></div>
 ";

$dom = new DOMDocument();
$dom->loadHTML($html); 
$html=$dom->C14N();
echo $html;

But it shows:

<html>
<body>
<div>
<magic></magic>
<video controls=""></video>
<a href="http://example.com">Example</a>
<br></br>
<a href="http://google.com">Google</a>
<br></br>
</div>
</body>
</html>

How do I save just the HTML fragment? I tried $dom->C14N(), but it still adds html and body tags. It also adds </br>, but that's no big deal.

At this point, I am resorting to preg_replace to remove the html and body tags, but it would be nice if there were a way to save it as a fragment.

netrox

1 Answer

Pass the LIBXML_HTML_NOIMPLIED and LIBXML_HTML_NODEFDTD options to loadHTML(), then serialize with saveHTML() instead of C14N():

$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$html=$dom->saveHTML();

See PHP documentation:

LIBXML_HTML_NOIMPLIED (integer)
Sets HTML_PARSE_NOIMPLIED flag, which turns off the automatic adding of implied html/body... elements.

LIBXML_HTML_NODEFDTD (integer)
Sets HTML_PARSE_NODEFDTD flag, which prevents a default doctype being added when one is not found.
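
For reference, here is how those two lines might look when dropped into the snippet from the question. This is a minimal sketch, not a definitive implementation: the libxml_use_internal_errors() call is my own addition (an assumption that you want to silence parser warnings about tags such as magic), and the variable name $fragment is only illustrative.

```php
<?php
// The fragment from the question (trailing whitespace after </div> dropped).
$html = "<div><magic></magic>
 <video controls></video>
    <a href='http://example.com'>Example</a><br>
    <a href='http://google.com'>Google</a><br></div>";

// Assumption: collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
// NOIMPLIED: do not add implied html/body elements.
// NODEFDTD: do not add a default doctype.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_clear_errors();

// saveHTML() serializes as HTML, unlike C14N(), which canonicalizes as XML.
$fragment = $dom->saveHTML();
echo $fragment;
```

Because saveHTML() writes HTML rather than canonical XML, void elements such as br should no longer get a closing tag.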

Wiktor Stribiżew
  • Note that using the `LIBXML_HTML_NOIMPLIED` option can cause some tags not to render correctly and if so would require a workaround. See https://www.php.net/manual/en/domdocument.savehtml.php#121444 for further explanation. – Talk Nerdy To Me Jan 29 '21 at 20:40
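
One commonly used workaround for fragments with more than one top-level node is to wrap the input in a temporary container and serialize only the container's children. The sketch below is an assumption on my part and not necessarily the approach described in the linked note; saveFragment() and the fragment-wrapper id are hypothetical names, and the input is assumed to be ASCII (loadHTML() needs extra care with other encodings).

```php
<?php
// Hypothetical helper: parse an HTML fragment and return it re-serialized,
// without html/body wrappers, a doctype, or the temporary container.
function saveFragment(string $fragment): string
{
    // Collect parser warnings internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // The wrapper div guarantees a single root element for the parser.
    $dom->loadHTML(
        '<div id="fragment-wrapper">' . $fragment . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize each child of the wrapper, leaving the wrapper itself out.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo saveFragment('<p>one</p><p>two</p>');
```

Any DOM changes, such as adding a custom attribute, would go between the loadHTML() call and the serialization loop.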