How to get innerHTML of DOMNode?

Question

What function do you use to get the innerHTML of a given DOMNode in the PHP DOM implementation? Can someone give a reliable solution?

Of course outerHTML will do too.

score 172 · Accepted Answer · edited Oct 19 '16 at 17:20

Compare this updated variant with PHP Manual User Note #89718:

<?php 
function DOMinnerHTML(DOMNode $element) 
{ 
    $innerHTML = ""; 
    $children  = $element->childNodes;

    foreach ($children as $child) 
    { 
        $innerHTML .= $element->ownerDocument->saveHTML($child);
    }

    return $innerHTML; 
} 
?>

Example:

<?php 
$dom= new DOMDocument(); 
$dom->preserveWhiteSpace = false;
$dom->formatOutput       = true;
$dom->loadHTML($html_string); // loadHTML() parses a string; load() expects the path of an XML file

$domTables = $dom->getElementsByTagName("table"); 

// Iterate over DOMNodeList (Implements Traversable)
foreach ($domTables as $table) 
{ 
    echo DOMinnerHTML($table); 
} 
?>
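
For a self-contained run, here is a minimal sketch of the same idea using an inline HTML string. The markup and the variable names are assumptions made for illustration. It also shows that passing the element itself to saveHTML() gives the outer HTML, which requires PHP 5.3.6 or later.

```php
<?php
// Inner HTML: serialize each child node and concatenate the results.
function DOMinnerHTML(DOMNode $element)
{
    $innerHTML = '';
    foreach ($element->childNodes as $child) {
        $innerHTML .= $element->ownerDocument->saveHTML($child);
    }
    return $innerHTML;
}

// Illustrative input (an assumption); any HTML string would do.
$html = '<div id="box"><p>Hello <b>world</b></p><p>Bye</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$div = $dom->getElementsByTagName('div')->item(0);

$inner = DOMinnerHTML($div);   // the two <p> elements, without the <div>
$outer = $dom->saveHTML($div); // the <div> element including its children

echo $inner, "\n", $outer, "\n";
```

The serializer may insert line breaks between block elements, so the result is not guaranteed to be byte-for-byte identical to the input.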

edited Oct 19 '16 at 17:20 · Leo

answered Jan 18 '10 at 15:38 · Haim Evgi

Thanks. It works fine. Shouldn't $dom->preserveWhiteSpace = false; be before document load? – Dawid Ohia Jan 18 '10 at 18:59

@JohnM2: [Yes it should](http://stackoverflow.com/questions/798967/php-simplexml-how-to-save-the-file-in-a-formatted-way). – hakre Jun 23 '13 at 18:35

Additional notes: Since PHP 5.3.6 you can spare the temporary `DOMDocument`. Also one might want to replace the `trim` with an `ltrim` (or even remove it completely) to preserve a bit of the whitespace like line-breaks. – hakre Jun 23 '13 at 22:01

A function like this should be added to the DomDocument class. – Nate Aug 24 '14 at 15:26

I had to change the function declaration to expect a `DOMElement` instead of a `DOMNode` as I was passing the return from `DOMDocument::getElementById()`. Just in case it trips someone else up. – miken32 Oct 04 '14 at 22:08

Why do you loop through all the children? Couldn't you just `saveHTML()` on the `$table`? Look: [PHP outerHTML S/O](http://stackoverflow.com/questions/5404941/how-to-return-outer-html-of-domdocument) – Aaron Gillion Jun 01 '15 at 06:32

This doesn't work. The error is "DOMDocument::saveHTML() expects exactly 0 parameters, 1 given" – machineaddict Dec 11 '15 at 11:40

Note: this does not return the original HTML but a slightly modified version; for example, given `11
22
33` you will not get the exact original back. – ViliusL Oct 12 '22 at 07:56

score 38 · Answer 2 · answered Aug 28 '16 at 16:38

Here is a version in a functional programming style:

function innerHTML($node) {
    return implode(array_map([$node->ownerDocument,"saveHTML"], 
                             iterator_to_array($node->childNodes)));
}

answered Aug 28 '16 at 16:38 · trincot

This version gives me joy! So clean and efficient – Uchenna Ajah Mar 24 '23 at 22:34

score 18 · Answer 3 · answered May 13 '16 at 19:25

To return the html of an element, you can use C14N():

$dom = new DOMDocument();
$dom->loadHtml($html);
$x = new DOMXpath($dom);
foreach($x->query('//table') as $table){
    echo $table->C14N();
}

answered May 13 '16 at 19:25 · Pedro Lobito

C14N will attempt to convert the HTML to valid XML. For example, `<br>` will become `<br></br>`. – ajaybc May 18 '16 at 04:05

It's a dirty way to dump the HTML of the element without having to use saveHTML, which will output html, head and body tags. – Pedro Lobito May 18 '16 at 14:53

echo utf8_decode($table->C14N()); – Vit Sep 21 '22 at 16:38
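
To make the difference discussed in these comments concrete, here is a small sketch that serializes the same element with C14N() and with saveHTML(). The sample markup is an assumption chosen for illustration.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<div>line one<br>line two</div>');

$div = $dom->getElementsByTagName('div')->item(0);

// Canonical XML: every element gets an explicit closing tag, including <br>.
$canonical = $div->C14N();

// HTML serialization: <br> stays a void element.
$asHtml = $dom->saveHTML($div);

echo $canonical, "\n", $asHtml, "\n";
```

Both calls return the outer markup of the element, so C14N() is closer to outerHTML than to innerHTML, and its output is XML rather than HTML.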

Alf Eaton · Answer 4 · 2016-06-28T14:55:58.023

A simplified version of Haim Evgi's answer:

<?php

function innerHTML(\DOMElement $element)
{
    $doc = $element->ownerDocument;

    $html = '';

    foreach ($element->childNodes as $node) {
        $html .= $doc->saveHTML($node);
    }

    return $html;
}

Example usage:

<?php

$doc = new \DOMDocument();
$doc->loadHTML("<body><div id='foo'><p>This is <b>an <i>example</i></b> paragraph<br>\n\ncontaining newlines.</p><p>This is another paragraph.</p></div></body>");

print innerHTML($doc->getElementById('foo'));

/*
<p>This is <b>an <i>example</i></b> paragraph<br>

containing newlines.</p>
<p>This is another paragraph.</p>
*/

There's no need to set preserveWhiteSpace or formatOutput.

score 5 · Answer 5 · answered Oct 05 '16 at 08:21

Similar to trincot's nice version with array_map and implode, but this time with array_reduce:

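function innerHTML(\DOMNode $node) {
    return array_reduce(
        iterator_to_array($node->childNodes),
        function ($carry, \DOMNode $child) {
            return $carry . $child->ownerDocument->saveHTML($child);
        },
        '' // initial value, so a node without children yields an empty string
    );
}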

I still don't understand why there is no reduce() function that accepts arrays and iterators alike.
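
Such a function is easy to sketch in userland. The helper name iterable_reduce below is hypothetical and not part of PHP; it only assumes the `iterable` type (PHP 7.1+), which covers both arrays and Traversable objects such as DOMNodeList.

```php
<?php
// Hypothetical helper (not a PHP built-in): a reduce that takes any iterable.
function iterable_reduce(iterable $items, callable $callback, $initial = null)
{
    $carry = $initial;
    foreach ($items as $item) {
        $carry = $callback($carry, $item);
    }
    return $carry;
}

function innerHTML(DOMNode $node)
{
    return iterable_reduce(
        $node->childNodes, // a DOMNodeList, passed without iterator_to_array()
        function ($carry, DOMNode $child) {
            return $carry . $child->ownerDocument->saveHTML($child);
        },
        ''
    );
}

$doc = new DOMDocument();
$doc->loadHTML('<div><i>a</i><b>b</b></div>');

$result = innerHTML($doc->getElementsByTagName('div')->item(0));
echo $result, "\n";
```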

score 3 · Answer 6 · answered Jun 05 '14 at 18:55

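// Replaces the children of $node with the markup in $newvalue (a setter for inner HTML).
function setnodevalue($doc, $node, $newvalue){
  // Remove all existing children of the target node.
  while($node->childNodes->length > 0){
    $node->removeChild($node->firstChild);
  }
  if(!empty($newvalue)){
    // appendXML() parses the string as XML, so $newvalue must be well-formed.
    $fragment = $doc->createDocumentFragment();
    $fragment->appendXML(trim($newvalue));
    $nod = $doc->importNode($fragment, true);
    $node->appendChild($nod);
  }
}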

answered Jun 05 '14 at 18:55 · Chris

You do not need to pass the document to the function, you can use `$node->ownerDocument`. – Keyacom Mar 26 '23 at 17:41
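
A minimal sketch of that suggestion follows. The function name setInnerXml and the sample markup are made up for illustration; the document is taken from `$node->ownerDocument` instead of being passed in.

```php
<?php
// Hypothetical variant of the function above; the name setInnerXml is made up here.
function setInnerXml(DOMNode $node, $xml)
{
    // Drop the current children.
    while ($node->firstChild !== null) {
        $node->removeChild($node->firstChild);
    }
    if ($xml === '') {
        return;
    }
    // appendXML() expects well-formed XML (for example <br/> rather than <br>).
    $fragment = $node->ownerDocument->createDocumentFragment();
    $fragment->appendXML($xml);
    $node->appendChild($fragment);
}

$doc = new DOMDocument();
$doc->loadHTML('<div id="target"><p>old</p></div>');
$target = $doc->getElementById('target');

setInnerXml($target, '<span>new</span>');

$result = $doc->saveHTML($target);
echo $result, "\n";
```

Because the fragment is parsed as XML, this approach fits trusted, well-formed snippets better than arbitrary HTML.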

birgire · Answer 7 · 2018-12-14T10:12:24.533

Here's another approach based on this comment by Drupella on php.net, that worked well for my project. It defines the innerHTML() by creating a new DOMDocument, importing and appending to it the target node, instead of explicitly iterating over child nodes.

InnerHTML

Let's define this helper function:

function innerHTML( \DOMNode $n, $include_target_tag = true ) {
  $doc = new \DOMDocument();
  $doc->appendChild( $doc->importNode( $n, true ) );
  $html = trim( $doc->saveHTML() );
  if ( $include_target_tag ) {
      return $html;
  }
  return preg_replace( '@^<' . $n->nodeName .'[^>]*>|</'. $n->nodeName .'>$@', '', $html );
}

where we can include/exclude the outer target tag through the second input argument.

Usage Example

Here we extract the inner HTML for a target tag given by the "first" id attribute:

$html = '<div id="first"><h1>Hello</h1></div><div id="second"><p>World!</p></div>';
$doc  = new \DOMDocument();
$doc->loadHTML( $html );
$node = $doc->getElementById( 'first' );

if ( $node instanceof \DOMNode ) {

    echo innerHTML( $node, true );
    // Output: <div id="first"><h1>Hello</h1></div>    

    echo innerHTML( $node, false );
    // Output: <h1>Hello</h1>
}

Live example:

http://sandbox.onlinephpfunctions.com/code/2714ea116aad9957c3c437d46134a1688e9133b8

score 1 · Answer 8 · answered Mar 13 '20 at 15:49

An old question, but there is a built-in way to do this: pass the target node to DOMDocument::saveHTML() (supported since PHP 5.3.6).

Full example:

$html = '<div><p>ciao questa è una <b>prova</b>.</p></div>';
$dom = new DOMDocument(); // the constructor takes a version and an encoding, not markup
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// query() returns a DOMNodeList, but saveHTML() needs a single DOMNode, so take the first match.
// './/div/*' matches the child elements of the div (here the <p>); './/div' matches the div itself.
$node = $xpath->query('.//div/*')->item(0);
$innerHtml = $dom->saveHTML($node);
var_dump($innerHtml);

Output: ciao questa è una prova.

Warning: DOMDocument::saveHTML() expects parameter 1 to be DOMNode, object given — Ivan Gusev, Jun 16 '20 at 08:40
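
That warning appears because DOMXPath::query() returns a DOMNodeList rather than a single node. A minimal sketch of one way around it, assuming illustrative markup, is to serialize the matched nodes one at a time:

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<div><p>one</p><p>two</p></div>');
$xpath = new DOMXPath($dom);

// Serialize each matched node separately, then join the pieces.
$parts = [];
foreach ($xpath->query('//div/*') as $node) {
    $parts[] = $dom->saveHTML($node);
}
$innerHtml = implode('', $parts);

echo $innerHtml, "\n";
```

Note that `*` matches element nodes only; using `//div/node()` instead would also pick up text nodes that sit directly inside the div.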

score 1 · Answer 9 · edited Oct 26 '21 at 03:37

For people who want to get the HTML from XPath query, here is my version:

$xpath = new DOMXpath( $my_dom_object );

$DOMNodeList = $xpath->query('//div[contains(@class, "some_custom_class_in_html")]');

if( $DOMNodeList->count() > 0 ) {
    $page_html = $my_dom_object->saveHTML( $DOMNodeList->item(0) );
}

edited Oct 26 '21 at 03:37 · Dharman

answered Oct 26 '21 at 03:31 · Ellery Leung

Rahul Mittal · Answer 10 · 2023-03-07T06:24:51.067

innerHTML using C14N() and xpath query:


$node->C14N(
   true,  // $exclusive: use exclusive canonicalization
   false, // $withComments: leave comments out
   ["query" => ".//node()|.//*//@*"] // $xpath: restrict output to inner nodes & attributes
);

edited Mar 07 '23 at 06:24 · answered Mar 07 '23 at 05:34

Keyacom · Answer 11 · 2023-03-28T18:25:55.253

Edit (PHP 8)

Using mb_convert_encoding with HTML-ENTITIES is deprecated as of PHP 8.2.

function setInnerHTML($element, $content) {
    $DOMInnerHTML = new DOMDocument();
    $DOMInnerHTML->loadHTML(
        <<<HTML
        <html>
            <head>
                <meta charset="utf-8">
            </head>
            <body>
                $content
            </body>
        </html>
        HTML,
    );
    foreach (
        $DOMInnerHTML->getElementsByTagName('body')->item(0)->childNodes
        as $contentNode
    ) {
        $contentNode = $element->ownerDocument->importNode($contentNode, true);
        $element->appendChild($contentNode);
    }
}

Including an HTML boilerplate is probably the best way to achieve clean, UTF-8 encoded text within the added DOM nodes. I've tried creating a drop-in replacement for mb_convert_encoding with HTML-ENTITIES, but I always ended up with mojibake.

Original

After experimenting with some implementations I found here, I engineered the perfect solution that you can use to set inner HTML:

function setInnerHTML($element, $content) {
    $DOMInnerHTML = new DOMDocument();
    $DOMInnerHTML->loadHTML(
        mb_convert_encoding("<div>$content</div>", 'HTML-ENTITIES', 'UTF-8')
    );
    foreach (
        $DOMInnerHTML->getElementsByTagName('div')->item(0)->childNodes
        as $contentNode
    ) {
        $contentNode = $element->ownerDocument->importNode($contentNode, true);
        $element->appendChild($contentNode);
    }
}

Notes:

Because of the mb_convert_encoding function, this also requires the mbstring extension. If you omit the call here, this might cause mojibake.
This creates a <div> element to prevent creating an implicit <p> if there is no root element. This prevents problems when embedding into an element like <title>.
To not create a DocumentFragment, this fetches a DOMNodeList of the nodes, iterates through it, and appends each node to the element.
Ideally, setters should not return a value.

I created this to implement a basic templating system into a school project of mine.
