With this code:

<?php
    $s = '<h1>Header</h1>';
    $dom = new DOMDocument();
    $dom->loadHTML($s, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    var_dump($dom->documentElement->childNodes->item(0));
?>

On my development machine, the var_dump outputs a DOMText object, yet on my production machine it returns a DOMElement object.

The production server is running PHP 5.4.33 with libxml 2.7.8.

The dev machine is running PHP 5.4.4 with libxml 2.8.0.

danbroooks

1 Answer

It probably has something to do with different PHP versions: the DOMDocument::loadHTML method has only accepted a second ($options) argument since PHP 5.4, as listed in the changelog:

5.4.0 DOMDocument::loadHTML Added options parameter.
DOMDocument::loadHTMLFile Added options parameter.

The changes are also listed on the DOMDocument::loadHTML doc pages.

Update:

After some digging, I found a lot of contradictory information concerning the LIBXML_HTML_NODEFDTD constant. According to the docs, it's available in libxml >= 2.7.7, whereas other sources contradict this. I have found some projects that define this constant manually, and an answer here that states that this constant is only available from libxml 2.7.8.
This could explain the difference between your two environments. An open-source project on GitHub tackles this issue by simply defining the constant if needed:

defined('LIBXML_HTML_NODEFDTD') || define ('LIBXML_HTML_NODEFDTD', 4);
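
Since the constant can be defined while the underlying libxml still ignores it, it may help to test the behaviour rather than the value. The following is only a rough sketch: the function names libxmlReport() and noImpliedIsHonoured() are made up for illustration, and LIBXML_DOTTED_VERSION reports the libxml version PHP was built against, which I am assuming matches what each machine actually runs.

```php
<?php
// Hypothetical helper: collect the versions and constants relevant to
// the loadHTML() $options argument.
function libxmlReport()
{
    return array(
        'php'               => PHP_VERSION,
        'libxml'            => LIBXML_DOTTED_VERSION,
        'noimplied_defined' => defined('LIBXML_HTML_NOIMPLIED'),
        'nodefdtd_defined'  => defined('LIBXML_HTML_NODEFDTD'),
    );
}

// Hypothetical helper: returns true only if the parser really skipped the
// implied <html><body> wrapper, i.e. the root element is the <h1> itself.
function noImpliedIsHonoured()
{
    if (!defined('LIBXML_HTML_NOIMPLIED')) {
        return false;
    }
    $options = LIBXML_HTML_NOIMPLIED;
    if (defined('LIBXML_HTML_NODEFDTD')) {
        $options |= LIBXML_HTML_NODEFDTD;
    }

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $dom->loadHTML('<h1>Header</h1>', $options);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->documentElement !== null
        && $dom->documentElement->nodeName === 'h1';
}

$report = libxmlReport();
$report['noimplied_honoured'] = noImpliedIsHonoured();
print_r($report);
```

Running this on both machines and comparing the output should show whether the flags are merely defined or actually take effect.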
Elias Van Ootegem
  • Sorry - added the versions to my question, both machines are running php 5.4 – danbroooks Oct 13 '14 at 10:05
  • You are correct though, as saveHTML() on my server spits out ` Header `, so it is obviously ignoring the options argument... – danbroooks Oct 13 '14 at 10:11
  • @danbroooks: I'm scanning through a couple of bug reports, and the libxml changelogs now. There have been people reporting similar behavior to that which you seem to encounter... using xpaths is the common workaround ATM, though – Elias Van Ootegem Oct 13 '14 at 11:10
  • @danbroooks: added some findings after a quick Google session – Elias Van Ootegem Oct 13 '14 at 12:19
  • Thanks for doing some digging for me! However, defining the constant doesn't seem to do much, and looking at the values of the constants on both machines shows they hold the same values (4 & 8192). – danbroooks Oct 13 '14 at 13:21
  • To be honest it looks like `LIBXML_HTML_NOIMPLIED` is the argument that is refusing to work... with `LIBXML_HTML_NODEFDTD` removed the doctype is added as you would expect. – danbroooks Oct 13 '14 at 14:51
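
The XPath workaround mentioned in the comments could look roughly like the sketch below. It sidesteps the flags entirely: the markup is loaded without options, so the implied html and body elements are always present, and the first element inside the body is then selected explicitly. The function name firstBodyElement() is hypothetical, and this assumes the fragment contains at least one element.

```php
<?php
// Hypothetical helper: return the first element inside <body>, whether or
// not the local libxml honours LIBXML_HTML_NOIMPLIED.
function firstBodyElement($html)
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $dom->loadHTML($html); // no options: <html><body> wrapper is always added
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('/html/body/*[1]');

    return $nodes->length > 0 ? $nodes->item(0) : null;
}

$node = firstBodyElement('<h1>Header</h1>');
var_dump($node->nodeName);
```

Because the lookup no longer depends on how the document root was built, it should behave the same on both machines.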