How could I extract the string "text" from this markup using the PHP DOM?

<div><span>notthis</span>text</div>

$div->nodeValue returns "notthistext", so it also includes the span's "notthis".

Ben G

2 Answers

You can access the DOMText node directly using XPath:

$xpath = new DOMXPath($dom_document);
$node = $xpath->query('//div/text()')->item(0);
echo $node->textContent; // text
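
If you already hold a reference to the div, the same idea can be scoped to that element by passing it as the context node to query(). This is only a minimal sketch: the loadHTML() setup and the way $div is obtained are assumptions for illustration, not part of the question.

```php
<?php
// Assumed setup: the question's markup loaded into a DOMDocument.
$dom_document = new DOMDocument();
$dom_document->loadHTML('<div><span>notthis</span>text</div>');

$xpath = new DOMXPath($dom_document);

// Assumption: $div is the first div in the document.
$div = $dom_document->getElementsByTagName('div')->item(0);

// 'text()' is evaluated relative to $div, so only its direct
// text-node children match; the text inside the span does not.
$textNodes = $xpath->query('text()', $div);

// Concatenate in case the div has several direct text nodes.
$text = '';
foreach ($textNodes as $textNode) {
    $text .= $textNode->textContent;
}

echo $text; // text
```

Scoping the query this way avoids picking up text from other div elements elsewhere in the document, which '//div/text()' would also match.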
netcoder

As long as you are free to modify the DOM, you could remove the span first and then read the div's value:

$span = $div->getElementsByTagName('span')->item(0);
$div->removeChild($span);

$nodeValue = $div->nodeValue;
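
If the span has to stay in the document, one possible variation is to do the removal on a deep copy of the div instead. The sketch below assumes the question's markup is loaded with loadHTML(); the names $dom, $copy and $spans are made up for the example.

```php
<?php
// Assumed setup: the question's markup loaded into a DOMDocument.
$dom = new DOMDocument();
$dom->loadHTML('<div><span>notthis</span>text</div>');
$div = $dom->getElementsByTagName('div')->item(0);

// Deep-clone the div so the original document keeps its span.
$copy = $div->cloneNode(true);

// getElementsByTagName() returns a live list, so snapshot it
// into an array before removing nodes while iterating.
$spans = iterator_to_array($copy->getElementsByTagName('span'));
foreach ($spans as $span) {
    $span->parentNode->removeChild($span);
}

$nodeValue = $copy->nodeValue;
echo $nodeValue; // text
```

Note that this strips every span inside the copy, nested ones included, so it only fits when all span content should be ignored.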

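Alternatively, loop over the child nodes of $div and pick out the text node.

foreach ($div->childNodes as $node) {
    // Skip anything that is not a text node (such as the span element).
    if ($node->nodeType !== XML_TEXT_NODE) {
        continue;
    }
    // Store the node's string value, not the DOMText object itself.
    $nodeValue = $node->nodeValue;
}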

If you end up with more text nodes and only want the first, you can break after the first assignment of $nodeValue.
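
The "stop at the first text node" variant could be wrapped in a small function, where returning early plays the role of the break. This is a sketch only: firstDirectText() is a hypothetical helper name, and the markup with a second span and text node is invented to show the effect.

```php
<?php
// Hypothetical helper: returns the value of the first direct
// text-node child of $element, or null if there is none.
function firstDirectText(DOMNode $element): ?string
{
    foreach ($element->childNodes as $node) {
        if ($node->nodeType === XML_TEXT_NODE) {
            // Returning here stops the loop, like a break would.
            return $node->nodeValue;
        }
    }
    return null;
}

// Assumed sample markup with two direct text nodes in the div.
$dom = new DOMDocument();
$dom->loadHTML('<div><span>notthis</span>text<span>nor this</span>more</div>');
$div = $dom->getElementsByTagName('div')->item(0);

$first = firstDirectText($div);
echo $first; // text
```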

alex