I am parsing a website's HTML and there is a 'table' inside an 'a':
<?php
$dom = new DOMDocument;
$dom->loadHTML("<!DOCTYPE html>
<html>
<head></head>
<body>
<a>
<table><tr><td></td></tr></table>
</a>
</body>
</html>");
if ($dom->getElementsByTagName("table")->item(0)->parentNode->nodeName === "body") {
    echo "Why is table a child of 'body'? It should be a child of 'a'.";
}
I also get this warning:
PHP Warning: DOMDocument::loadHTML(): Unexpected end tag : a in Entity, line: ...
I am using PHP 7.4.
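To see exactly what the parser reports and where the table ends up, the libxml messages can be collected instead of being emitted as PHP warnings. This is only a minimal diagnostic sketch: it uses the same markup as above, the variable names ($previous, $errors, $table) are illustrative, and it does not change how the tree is built.

```php
<?php
// Collect libxml messages internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML("<!DOCTYPE html>
<html>
<head></head>
<body>
<a>
<table><tr><td></td></tr></table>
</a>
</body>
</html>");

// Fetch everything libxml recorded while parsing, then reset its state.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Print each recorded message with the line it refers to.
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Show which element the table was actually attached to.
$table = $dom->getElementsByTagName("table")->item(0);
echo "table parent: ", $table->parentNode->nodeName, "\n";
```

The printed parent name shows directly whether the table was attached to 'a' or to 'body' on a given PHP/libxml build.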
I know 'table's are not officially allowed inside 'a's. But:
- The warning says something else entirely: it complains about an unexpected end tag, not about invalid nesting.
- Making the 'table' a child of 'body' just because I put it inside an 'a' does not make sense.
What can I do? At the very least, I want the table not to end up as a child of 'body', because otherwise I cannot parse sites properly.
Edit: Please see the comments under this question. In HTML5, a 'table' is allowed inside an 'a' in this case, which makes this behavior even stranger.