
I have the following content in a WordPress post:

the_content() 

<div class="directions">
<div class="left"> left </div>
<div class ="right"> right </div>
</div>

Using DOMXPath, I want to extract and print only the div with class "left". I tried this:

<?php libxml_use_internal_errors(true); ?>
<?php libxml_disable_entity_loader(true); ?>

<?php $html = new DOMDocument();?>
<?php $content = get_the_content();?>
<?php $html->loadHTML($content);?>
<?php $xpath = new DOMXpath($html); ?>
<?php $xpath->query('//div[contains(@class, "left")]'); ?>
<?php echo $xpath -> textContent;  ?>

Unfortunately, I get nothing returned. Does someone see my mistake?

Valentin

1 Answer


You want to store the result of DOMXPath::query into a variable, which will be a DOMNodeList.

You can then access it like an array (although interestingly, the manual doesn't mention it):

$matches = $xpath->query('//div[contains(@class, "left")]');
echo $matches[0]->textContent;

(Note: ideally, you also want to check that your query returns at least one result.)
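A minimal sketch of that check, assuming the markup from the question. The hard-coded `$content` string is a stand-in for `get_the_content()` so the snippet runs outside WordPress:

```php
<?php
// Stand-in for get_the_content(); assumed markup, copied from the question.
$content = '<div class="directions"><div class="left"> left </div><div class="right"> right </div></div>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$html = new DOMDocument();
$html->loadHTML($content);
$xpath = new DOMXPath($html);

$matches = $xpath->query('//div[contains(@class, "left")]');

// query() returns false for an invalid expression and an empty list when nothing matches.
$text = '';
if ($matches !== false && $matches->length > 0) {
    $text = trim($matches->item(0)->textContent);
}

echo $text;
```

`item(0)` is the documented way to read from a DOMNodeList and behaves like `$matches[0]` here.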

Demo: https://3v4l.org/M0TQX

Side note: you don't need to open and close PHP tags on every line :)


Edit

Actually using DOMXPath::evaluate makes this even easier (thanks to @ThW for the helpful comment below):

echo $xpath->evaluate('string(//div[contains(@class, "left")])');
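For reference, a small self-contained sketch of the `evaluate` variant, again using a hard-coded `$content` string as a stand-in for `get_the_content()`. The "middle" class in the second call is made up, only to show what happens when nothing matches:

```php
<?php
// Stand-in for get_the_content(); assumed markup, copied from the question.
$content = '<div class="directions"><div class="left"> left </div><div class="right"> right </div></div>';

libxml_use_internal_errors(true);

$html = new DOMDocument();
$html->loadHTML($content);
$xpath = new DOMXPath($html);

// string(...) converts the first matching node to its text, so evaluate() returns a PHP string.
$left = $xpath->evaluate('string(//div[contains(@class, "left")])');

// With no matching node, string() yields an empty string rather than an error.
$missing = $xpath->evaluate('string(//div[contains(@class, "middle")])');

echo trim($left);
```

Because an empty string comes back when nothing matches, no separate length check is needed with this form.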
Jeto
  • Thank you very much. However, my text contains umlauts such as ä, ö, ü, which aren't displayed correctly. Is there a way to solve this? – Valentin Dec 31 '19 at 00:23
  • @Valentin Try `$html->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'));` as suggested [here](https://stackoverflow.com/a/8218649/965834). I'm surprised there isn't a better way to do it, but that should work. Also updated the demo link. – Jeto Dec 31 '19 at 00:28
  • thank you, I did something similar using 'echo mb_convert_encoding($content->nodeValue, 'ISO-8859-1', 'auto');' One final question though: Can I somehow keep the div class in my output? – Valentin Dec 31 '19 at 01:05
  • @Valentin You can use `echo $matches[0]->ownerDocument->saveHTML($matches[0]);` (where `$matches` is the result from `$xpath->query` like in my answer). Or if you want just the class: `echo $matches[0]->getAttribute('class');` – Jeto Dec 31 '19 at 01:49
  • @Jeto *btw* You can get the text content directly with XPath using `$xpath->evaluate('string(//div[contains(@class, "left")])');` – ThW Jan 03 '20 at 18:21
  • @ThW Right, very good point. I somehow always forget that method exists. Will edit the answer accordingly. – Jeto Jan 03 '20 at 18:29
  • However it only works with `DOMXpath::evaluate()` not with `DOMXpath::query()` – ThW Jan 03 '20 at 19:16
  • Yeah I meant method as in class method :) – Jeto Jan 03 '20 at 19:30
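Pulling the comment suggestions together, here is a rough sketch that handles umlauts and keeps the div in the output. Two things in it are assumptions rather than anything from the thread: the sample text "Grüße links" is made up, and `mb_encode_numericentity` is swapped in for the `mb_convert_encoding(..., 'HTML-ENTITIES', 'UTF-8')` call because that target encoding is deprecated as of PHP 8.2. It requires the mbstring extension.

```php
<?php
// Stand-in for get_the_content(); the umlaut text is invented for illustration.
$content = '<div class="directions"><div class="left"> Grüße links </div><div class="right"> right </div></div>';

libxml_use_internal_errors(true);

// Turn every non-ASCII character into a numeric entity so loadHTML() cannot misread the encoding.
$encoded = mb_encode_numericentity($content, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

$html = new DOMDocument();
$html->loadHTML($encoded);
$xpath = new DOMXPath($html);

$matches = $xpath->query('//div[contains(@class, "left")]');

$text = $class = $outer = '';
if ($matches !== false && $matches->length > 0) {
    $node  = $matches->item(0);
    $text  = trim($node->textContent);              // DOM strings are returned as UTF-8
    $class = $node->getAttribute('class');          // just the class name
    $outer = $node->ownerDocument->saveHTML($node); // the div itself, including its class attribute
}

echo $outer;
```

Whether `saveHTML` prints the umlauts as raw characters or as entities can depend on the setup, so treat the exact output as something to check in your own environment.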