4

This is my example script:

$html = <<<HTML
<div class="main">
    <div class="text">
    Capture this text 1
    </div>
    <div class="date">
    May 2010
    </div>
</div>
<div class="main">
    <div class="text">
    Capture this text 2
    </div>
    <div class="date">
    June 2010
    </div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);


$tags = $xpath->query('//div[@class="main"]');
foreach ($tags as $tag) {
    print_r($tag->nodeValue."\n");
}

This will output:

Capture this text 1 May 2010
Capture this text 2 June 2010 

But I need it to output:

<div class="text">
Capture this text 2
</div>
<div class="date">
June 2010
</div>

Or at least be able to do something like this in my foreach loop:

$text = $tag->query('//div[@class="text"]')->nodeValue;
$date = $tag->query('//div[@class="date"]')->nodeValue;
alex
benjovanic
  • This question is not about XPath expression but about specific DOM implementation methods. –  Sep 27 '10 at 15:26

2 Answers

7

Well, nodeValue only gives you the node's value, that is, its text content with the markup stripped. What you want is what's commonly called outerHTML:

echo $dom->saveXml($tag);

will output what you are looking for in an X(HT)ML compliant way.


As of PHP 5.3.6 you can also pass a node to saveHtml, which wasn't possible previously:

echo $dom->saveHtml($tag);

The latter will obey HTML4 syntax. Thanks to Artefacto for that.
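
For the second form the question asks for (reading the text and the date separately inside the loop), one option is to run a relative XPath query with the current node as context. The following is a minimal sketch under a few assumptions: the markup mirrors the question's sample, and the names `$rows`, `$text` and `$date` are only illustrative. It relies on the optional second argument of `DOMXPath::query()`, the context node.

```php
<?php
// Sketch only: markup copied from the question's sample; $rows, $text and
// $date are illustrative names, not part of the original script.
$html = <<<HTML
<div class="main">
    <div class="text">
    Capture this text 1
    </div>
    <div class="date">
    May 2010
    </div>
</div>
<div class="main">
    <div class="text">
    Capture this text 2
    </div>
    <div class="date">
    June 2010
    </div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = array();
foreach ($xpath->query('//div[@class="main"]') as $tag) {
    // The leading "." keeps the search inside $tag. A bare "//" would start
    // again from the document root, even with a context node supplied.
    $text = $xpath->query('.//div[@class="text"]', $tag)->item(0)->nodeValue;
    $date = $xpath->query('.//div[@class="date"]', $tag)->item(0)->nodeValue;

    // nodeValue keeps the surrounding whitespace, so trim it.
    $rows[] = array(trim($text), trim($date));
}

print_r($rows);
```

Note that `item(0)` returns null when nothing matches, so real code would probably want to check for that before reading `nodeValue`.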

Gordon
  • Combine this one with a smidgin of JapanPro's answer as to `innerHTML`, and we could have `$result = '';foreach($tag->childNodes as $tag) $result.=$dom->saveXML($tag);` with the original XPath. – Wrikken Sep 27 '10 at 18:53
-1

Try this:

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

$tags = $xpath->query('//div[@class="main"]');

foreach ($tags as $tag) {
    $innerHTML = '';

    $children = $tag->childNodes;
    foreach ($children as $child) {
        $tmp_doc = new DOMDocument();
        $tmp_doc->appendChild($tmp_doc->importNode($child,true));       
        $innerHTML .= $tmp_doc->saveHTML();
    }

    var_dump(trim($innerHTML));
}

-Pascal MARTIN
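
On PHP 5.3.6 or later the temporary document is probably not needed, because `saveHTML()` accepts a node. A minimal sketch of that variant follows; `innerHtml()` is a hypothetical helper name, and the one-line markup is a shortened stand-in for the question's sample.

```php
<?php
// Sketch only, assuming PHP 5.3.6+ so that saveHTML() accepts a node.
// innerHtml() is a hypothetical helper, not a built-in DOM method.
function innerHtml(DOMNode $node)
{
    $out = '';
    // Serialise each child in turn; the wrapping element itself is skipped.
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

$dom = new DOMDocument();
$dom->loadHTML(
    '<div class="main">'
    . '<div class="text">Capture this text 2</div>'
    . '<div class="date">June 2010</div>'
    . '</div>'
);
$xpath = new DOMXPath($dom);

$result = '';
foreach ($xpath->query('//div[@class="main"]') as $tag) {
    $result = innerHtml($tag);
    echo $result, "\n";
}
```

Swapping `saveHTML` for `saveXML` in the helper should give the X(HT)ML-style serialisation on older PHP versions as well.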

Pramendra Gupta
  • Why was this voted down without testing the code? Please leave some feedback when voting down. – Pramendra Gupta Sep 27 '10 at 16:10
  • 3
    Why is it attributed to Pascal Martin? Did you get the code from one of his answers? – alex Mar 13 '11 at 14:06
  • @Pascal I think copying and pasting other users' answers should be discouraged here. – alex Mar 13 '11 at 14:33
  • @alex It seems to be largely based on a portion of the answer I gave there: stackoverflow.com/q/2574625/138475 *(I've had to delete/repost my comment to edit it a bit)* – Pascal MARTIN Mar 13 '11 at 14:33