When I load HTML content with DOMDocument, the markup gets restructured.

I know that p tags are not allowed inside h1, but this is what I have to work with. Although the spec says it’s not allowed, everything is still correctly nested (no missing closing tags, etc.).

...
<h1>
    <p>Nested paragraph</p>
</h1>
...

Then, when I run

$dom = new \DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($content);
// Serialize the parsed document to inspect the result
echo $dom->saveHTML();

the output looks like this:

<h1>

</h1>
<p>Nested paragraph</p>

The p has been moved outside the h1. Is there a way to tell DOMDocument not to enforce the spec's nesting rules and only ensure that tags are closed, etc.? How will this work with custom elements in the future?

Ric
  • Probably because according to https://stackoverflow.com/questions/19779519/is-it-valid-to-have-paragraph-elements-inside-of-a-heading-tag-in-html5-p-insid it's not valid. – Nigel Ren Dec 12 '17 at 16:53
  • @NigelRen thanks, I know this is not valid but that’s how I receive the HTML. I’m asking if I can stop DomDocument from editing this HTML. – Ric Dec 12 '17 at 16:56
  • If you treat it as XML - then it's OK, but this of course may cause other issues. – Nigel Ren Dec 12 '17 at 16:59
  • You could pre-process the source HTML before passing it to DOMDocument, and then restore it after you've exported it. – Nigel Ren Dec 12 '17 at 17:03
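
The XML route suggested in the comments can be sketched roughly as follows. This is only an illustration, and it assumes the fragment is well-formed XML with a single root element (every tag closed, void elements such as br written as self-closing, no HTML-only entities such as &nbsp;); the `$content` value is a stand-in for the real input.

```php
<?php
// Assumption: $content is well-formed XML with exactly one root element.
$content = "<h1>\n    <p>Nested paragraph</p>\n</h1>";

$dom = new DOMDocument('1.0', 'UTF-8');

// loadXML() uses the XML parser, which knows nothing about HTML
// content models, so the p element stays inside the h1.
$dom->loadXML($content);

// Passing a node to saveXML() serializes only that node,
// which leaves out the XML declaration.
$result = $dom->saveXML($dom->documentElement);

echo $result, "\n";
```

As the comment warns, this may cause other issues: input that is valid HTML but not well-formed XML will fail to load, and empty elements may be serialized in self-closing form, which browsers do not treat the same way for non-void HTML elements.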
