
Consider this markup:

<div class="a">foo</div>
<div class="a"><div id="1">bar</div></div>

To fetch the values of all divs with class `a`, I run the following query:

$q = $xpath->query('//div[@class="a"]');

However, I get this result:

foo
bar

But I want the actual content, including the child tags, so that it looks like this:

foo
<div id="1">bar</div>

How can I accomplish that with XPath and DOMDocument only?


Solved by the function provided here.

oh_shi
  • [See](http://stackoverflow.com/questions/7156054/getting-the-inner-html-of-a-domelement-in-php) [countless](http://stackoverflow.com/questions/4879946/domdocument-savehtml-without-html-wrapper) [other](http://stackoverflow.com/questions/3615389/innerhtml-in-xpath) [questions](http://stackoverflow.com/questions/2087103/innerhtml-in-phps-domdocument) asking the same thing. – salathe Aug 25 '11 at 20:37

3 Answers


Try something like:

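$doc = new DOMDocument;
$doc->loadHTML('<div class="a">Your HTML here.</div>'); // Placeholder markup; it must contain a div with class "a".
$xpath = new DOMXPath($doc);
$node = $xpath->query('//div[@class="a"]')->item(0); // First match, or null if nothing matched.
if ($node !== null) {
    // Outer HTML of the element, wrapper tag included (the node argument needs PHP 5.3.6+).
    $html = $node->ownerDocument->saveHTML($node);
}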
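Note that `saveHTML($node)` returns the element's outer HTML, so the wrapping `<div class="a">` is included, whereas the question asks for the inner content only. A minimal sketch of one way to get that, assuming PHP 7+ and a helper named `innerHTML()` that is hypothetical (it is not part of DOMDocument and not from this answer): serialize each child node and concatenate the results.

```php
<?php
// Hypothetical helper, not a built-in: builds the inner HTML of a node
// by serializing each of its children in document order.
function innerHTML(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // Passing a node to saveHTML() requires PHP 5.3.6 or later.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument;
$doc->loadHTML('<div class="a">foo</div><div class="a"><div id="1">bar</div></div>');
$xpath = new DOMXPath($doc);

// Collect the inner HTML of every div with class "a".
$results = [];
foreach ($xpath->query('//div[@class="a"]') as $node) {
    $results[] = innerHTML($node);
}
print_r($results);
```

For the markup from the question this should give `foo` for the first div and `<div id="1">bar</div>` for the second, which is the result the question asks for.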
kenorb

You can try SimpleXML:

$xml = '<?xml version=\'1.0\' encoding=\'UTF-8\' ?>
    <root>
    <div class="a">foo</div>
    <div class="a"><div id="1">bar</div></div>
    </root>';

$xml = simplexml_load_string($xml);            
var_dump($xml->xpath('//div[@class="a"]'));

In this case, though, you will have to iterate over the resulting objects.

Output:

array(2) {
  [0]=>
  object(SimpleXMLElement)#2 (2) {
    ["@attributes"]=>
    array(1) {
      ["class"]=>
      string(1) "a"
    }
    [0]=>
    string(3) "foo"
  }
  [1]=>
  object(SimpleXMLElement)#3 (2) {
    ["@attributes"]=>
    array(1) {
      ["class"]=>
      string(1) "a"
    }
    ["div"]=>
    string(3) "bar"
  }
}
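The `var_dump` output hides the child markup, but SimpleXML can serialize an element back to a string with `asXML()`. A short sketch, assuming the input is well-formed XML as in the snippet above; the variable names `$root` and `$markup` are only illustrative. Note that this returns the outer XML, so the matched `<div class="a">` wrapper is part of each string.

```php
<?php
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<root>
<div class="a">foo</div>
<div class="a"><div id="1">bar</div></div>
</root>';

$root = simplexml_load_string($xml);

$markup = [];
foreach ($root->xpath('//div[@class="a"]') as $div) {
    // asXML() without an argument returns the element, its attributes
    // and all of its descendants as a string.
    $markup[] = $div->asXML();
}
print_r($markup);
```

The strings keep the nested `<div id="1">` and its attribute, so no manual walk over the SimpleXMLElement objects is needed to turn them back into text.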

Andrej
  • But it also retrieves only bar in my case. – oh_shi Aug 25 '11 at 20:27
  • Sorry, you were right. However, the output does not include the `id` attribute (or am I missing something?). I mean the one on the inner `<div id="1">`. And the structure is pretty complex. Even if it showed the `id` attribute, I'd still have to convert all of that back to plain text. – oh_shi Aug 25 '11 at 20:36

PHP DOM has an undocumented '.nodeValue' attribute which acts exactly like .innerHTML in a browser. Once you've used XPath to get the node you want, just do $node->nodeValue to get the innerhtml.

Marc B
  • That's **exactly** what I do. But nodeValue does not interpret the children as actual value of the element. – oh_shi Aug 25 '11 at 20:24
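The behaviour described in this comment is easy to check. `nodeValue` is a documented property of `DOMNode`, and for an element PHP returns the concatenated text of its descendants, the same as `textContent`, with all tags dropped. A small sketch using the second div from the question (the variable names are only illustrative):

```php
<?php
$doc = new DOMDocument;
$doc->loadHTML('<div class="a"><div id="1">bar</div></div>');
$xpath = new DOMXPath($doc);
$node = $xpath->query('//div[@class="a"]')->item(0);

// Both properties give the text of all descendants; the child tags are lost.
$value = $node->nodeValue;
$text  = $node->textContent;
var_dump($value, $text);
```

Neither value contains the nested `<div id="1">` tag, so `nodeValue` is not a substitute for a browser's `innerHTML`; the child nodes have to be serialized explicitly.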