37

Is there any way I can insert an HTML template into an existing DOMNode without the content being encoded?

I have tried to do that with:

$dom->createElement('div', '<h1>Hello world</h1>');
$dom->createTextNode('<h1>Hello world</h1>');

The output is pretty much the same; the only difference is that the first call wraps it in a div. I have tried to loadHTML from a string, but I have no idea how to append its body content to another DOMDocument.

In javascript, this process seems to be quite simple and obvious.

RyanS
Nazariy

5 Answers

52

You can use DOMDocumentFragment::appendXML().

Example:

// just some setup
$dom = new DOMDocument;
$dom->loadXml('<html><body/></html>');
$body = $dom->documentElement->firstChild;

// this is the part you are looking for    
$template = $dom->createDocumentFragment();
$template->appendXML('<h1>This is <em>my</em> template</h1>');
$body->appendChild($template);

// output
echo $dom->saveXml();

Output:

<?xml version="1.0"?>
<html><body><h1>This is <em>my</em> template</h1></body></html>

If you want to import from another DOMDocument, replace the three template lines above with

$tpl = new DOMDocument;
$tpl->loadXml('<h1>This is <em>my</em> template</h1>');
$body->appendChild($dom->importNode($tpl->documentElement, TRUE));

Using TRUE as the second argument to importNode will do a recursive import of the node tree.
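To make the effect of that second argument concrete, here is a minimal sketch that imports the same node once shallowly and once deeply. The variable names `$shallow` and `$deep` are only illustrative and are not part of the answer above.

```php
<?php
// Sketch: compare a shallow and a deep import of the same template node.
$dom = new DOMDocument;
$dom->loadXML('<html><body/></html>');

$tpl = new DOMDocument;
$tpl->loadXML('<h1>This is <em>my</em> template</h1>');

// Shallow import: only the <h1> element itself is copied, without children.
$shallow = $dom->importNode($tpl->documentElement, false);

// Deep import: the <h1> element and its whole subtree are copied.
$deep = $dom->importNode($tpl->documentElement, true);

echo $dom->saveXML($shallow), "\n"; // the h1 element with no content
echo $dom->saveXML($deep), "\n";    // the h1 with its text and <em> child
```

Both copies belong to `$dom` but are not part of its tree until they are appended somewhere, for example with `$body->appendChild($deep)`.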


If you need to import (malformed) HTML, change loadXml to loadHTML. This will trigger the HTML parser of libxml (what ext/DOM uses internally):

libxml_use_internal_errors(true);
$tpl = new DOMDocument;
$tpl->loadHtml('<h1>This is <em>malformed</em> template</h2>');
$body->appendChild($dom->importNode($tpl->documentElement, TRUE));
libxml_use_internal_errors(false);

Note that libxml will try to correct the markup, e.g. it will change the wrong closing </h2> to </h1>.

Gordon
  • I like first example with createDocumentFragment it looks one command shorter than second, and just from curiosity which of the two are more memory/time efficient? – Nazariy Dec 09 '10 at 18:32
  • @Nazariy I have no clue. I've never put them against each other and in all the profilings I have done on applications, I've never noticed them to be an issue. – Gordon Dec 09 '10 at 18:39
  • This appends XML, not HTML, though? From other answers I've read, it would generate errors since HTML is not the same thing as "well formed XML". – Nate Apr 11 '15 at 21:14
  • 2
    @Nate `appendXml` expects well formed XML. If you want to append malformed html, you have to adapt the second approach to use the html loader. contrary to popular belief libxml can parse malformed markup to a good extent. – Gordon Apr 12 '15 at 07:49
  • 2
    @AaronGillion I am not sure what you mean. [This works fine on "nested HTML"](http://codepad.org/1BdMONzX). Can you provide a small example where you think it doesnt work. I'll help you figure it out then. – Gordon Jun 02 '15 at 06:27
  • @AaronGillion This worked just fine for me with some nested HTML. I think you need to check yourself before you wreck yourself. – James Jones Dec 23 '16 at 14:42
  • Small note: `loadXml` won't work with html entities - they're not known in xml and apparently `appendXml` doesn't like that (even if the owner document is html). (f.e. I was getting problems with appending html containing `…`) – Christian Jul 14 '22 at 17:53
46

It works with another DOMDocument for parsing the HTML code. But you need to import the nodes into the main document before you can use them in it:

$newDiv = $dom->createElement('div');
$tmpDoc = new DOMDocument();
$tmpDoc->loadHTML($str);
foreach ($tmpDoc->getElementsByTagName('body')->item(0)->childNodes as $node) {
    $node = $dom->importNode($node, true);
    $newDiv->appendChild($node);
}

And as a handy function:

function appendHTML(DOMNode $parent, $source) {
    $tmpDoc = new DOMDocument();
    $tmpDoc->loadHTML($source);
    foreach ($tmpDoc->getElementsByTagName('body')->item(0)->childNodes as $node) {
        $node = $parent->ownerDocument->importNode($node, true);
        $parent->appendChild($node);
    }
}

Then you can simply do this:

$elem = $dom->createElement('div');
appendHTML($elem, '<h1>Hello world</h1>');
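
For reference, a slightly more defensive variant might look like the sketch below. The function name `appendHtmlFragment` and the guards for empty input and a missing body are assumptions added for illustration; they are not part of the answer above. Like the original, it assumes `$parent` is a node inside a document, not the DOMDocument itself.

```php
<?php
// Hypothetical variant of appendHTML(); the name and the guards are assumptions.
function appendHtmlFragment(DOMNode $parent, string $source): void
{
    if (trim($source) === '') {
        return; // loadHTML() rejects an empty string
    }

    // Collect parser warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $tmpDoc = new DOMDocument();
    $tmpDoc->loadHTML($source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $tmpDoc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return; // nothing was parsed into a <body>
    }

    foreach ($body->childNodes as $node) {
        // importNode() returns a copy owned by the target document.
        $parent->appendChild($parent->ownerDocument->importNode($node, true));
    }
}

$dom = new DOMDocument();
$div = $dom->appendChild($dom->createElement('div'));
appendHtmlFragment($div, '<h1>Hello <em>world</em></h1><p>Second block</p>');
echo $dom->saveHTML();
```

Because importNode() copies the nodes rather than moving them, the loop can safely iterate over the temporary document's child list while appending.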
Mark
Gumbo
  • Thanks Gumbo, I thought about it too, but I'm a little stuck with getElementsByTagName: in my case there can be any HTML content, so I have to extract all child nodes of the body element. How can I do that? – Nazariy Dec 09 '10 at 17:19
  • 1
    @Nazariy: In that case get the BODY and do the import and append for all its children. – Gumbo Dec 09 '10 at 17:29
  • 10
    The code in the foreach doesn't work, because importNode() returns a reference to the NEW node that belongs to the original DOMDocument, so to be able to append it you have to keep that reference. What you are currently doing is importing the node and then trying to append the $tmpDoc node to the $parent that belongs to the original document. The correct procedure would be: `$importedNode = $parent->ownerDocument->importNode($node, TRUE); $parent->appendChild($importedNode);` – KnF Feb 27 '13 at 04:12
  • It seems that if `$str` is mainly a `` then it goes into the `` tag in the `$tmpDoc` – Timo Huovinen Jun 08 '13 at 15:56
  • 3
    A problem I ran into with this solution is mentioned in the comments for [DOMDocument::loadHTML()](http://www.php.net/manual/en/domdocument.loadhtml.php#88864), where if `$str` (or `$source`, in your second example) contains any text **not** contained in an HTML element, it will get wrapped in `<p>` tags. Any way of stopping `loadHTML()` from putting `<p>` tags on actual text nodes? – Travesty3 Jun 17 '13 at 20:56
  • 1
    the other solution is just simpler as it seems to me – Toskan Jan 10 '14 at 09:43
  • @KnF is correct. The code didn't work until modified per his explanation. – Nate Apr 11 '15 at 21:30
  • ^^^The only answer that works for nested HTML on this page, after applying the fix above. Also `importNode($node,TRUE)` to get all the elements. – Aaron Gillion Jun 02 '15 at 04:34
  • went ahead and edited answer to include `$node = [...]->importNode($node);` – user3338098 Aug 17 '15 at 20:09
  • this handy function saved me a bunch of hours. thanks! – Matteus Barbosa Feb 08 '19 at 16:38
  • When I try to add HTML that contains the ` – Rafaucau Dec 07 '20 at 23:55
  • Why does it have to be so RIDICULOUSLY complicated? "Put this HTML in there" involves ME having to parse the HTML and insert it node by node? – Szczepan Hołyszewski Aug 15 '22 at 13:13
  • @Travesty3: Not that I know of, but you can modify the code example to append the child nodes of that `<p>` element, thereby omitting it: `foreach ($tmpDoc->getElementsByTagName('body')->item(0)->firstChild->childNodes as $node) {` – Simon Jun 15 '23 at 11:39
  • If you have encoding problems, prepend the encoding when loading the html `$tmpDoc->loadHTML(''.$source);`, otherwise your string will be treated as ISO-8859-1, @see https://stackoverflow.com/a/8218649/208746 – Simon Jun 15 '23 at 11:48
12

I do not want to struggle with XML, because it throws errors more readily, and I am not a fan of prefixing an @ to suppress error output. In my opinion loadHTML does a better job, and it is as simple as this:

$doc = new DOMDocument();
$div = $doc->createElement('div');

// use a helper document to parse the HTML string
$helper = new DOMDocument();
$helper->loadHTML('<a href="#">This is my HTML Link.</a>');

// now the magic!
// import the document element of the $helper object deeply (true)
// into $doc and append it to the $div as a child.
$div->appendChild($doc->importNode($helper->documentElement, true));

// add the div to the $doc
$doc->appendChild($div);

// final output
echo $doc->saveHTML();
Markus Zeller
4

Here is a simple example using DOMDocumentFragment:

$doc = new DOMDocument();
$doc->loadXML("<root/>");
$f = $doc->createDocumentFragment();
$f->appendXML("<foo>text</foo><bar>text2</bar>");
$doc->documentElement->appendChild($f);
echo $doc->saveXML();

Here is a helper function for replacing a DOMNode:

/**
 * Helper method for replacing $node (DOMNode)
 * with XML code (string).
 * Belongs to a class that keeps its DOMDocument in $this->dom.
 *
 * @param DOMNode $node
 * @param string $xml
 */
public function replaceNodeXML($node, $xml) {
  $f = $this->dom->createDocumentFragment();
  $f->appendXML($xml); // expects well-formed XML
  $node->parentNode->replaceChild($f, $node);
}

Source: Some old "PHP5 Dom Based Template" article.
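
Since the helper above is a class method that depends on `$this->dom`, a standalone sketch can be easier to try out. The function name `replaceNodeWithXml`, the use of the node's `ownerDocument`, and the boolean return value are assumptions for illustration, not part of the cited article.

```php
<?php
// Hypothetical standalone variant: uses the node's own document instead of $this->dom.
function replaceNodeWithXml(DOMNode $node, string $xml): bool
{
    $fragment = $node->ownerDocument->createDocumentFragment();

    // Keep libxml quiet while parsing; appendXML() returns false on failure.
    $previous = libxml_use_internal_errors(true);
    $ok = $fragment->appendXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        return false; // leave the original node untouched
    }

    // Replacing with a fragment inserts the fragment's children in place of $node.
    $node->parentNode->replaceChild($fragment, $node);
    return true;
}

$doc = new DOMDocument();
$doc->loadXML('<root><placeholder/></root>');
replaceNodeWithXml($doc->documentElement->firstChild, '<foo>text</foo><bar>text2</bar>');
echo $doc->saveXML();
```

The `<placeholder/>` element is just example input; any node that has a parent can be replaced this way.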

And here is another suggestion, posted by Pian0_M4n, to use a value attribute as a workaround:

$dom = new DomDocument;

// main object
$object = $dom->createElement('div');

// html attribute
$attr = $dom->createAttribute('value');
// ugly html string
$attr->value = "<div>&nbsp; this is a really html string &copy;</div><i></i> with all the &copy; that XML hates!";
$object->appendChild($attr);

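// client-side fix with jQuery (plain JavaScript works as well); the lines below are JavaScript, not PHP
$('div[value]').html(function () { return $(this).attr('value'); }); // move the attribute text into the element
$('div[value]').removeAttr('value'); // clean up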

Not ideal, but at least it works.

Community
kenorb
-1

Gumbo's code works perfectly! Just one small enhancement: add the TRUE parameter so that it works with nested HTML snippets.

// before
$node = $parent->ownerDocument->importNode($node);
// after
$node = $parent->ownerDocument->importNode($node, TRUE);
hailong