Using preg_match_all, is it possible to match elements within a parent that has a specific class name?

For example I have this HTML markup:

<div class="red lorem-ipsum">
  <a href="#">Some link</a>
</div>

<div class="red">
  <a href="#">Some link</a>
</div>

<div class="something red lorem-ipsum">
  <a href="#">Some link</a>
</div>

Can I match each <a> that's within a parent with class name red?

I tried this but it does not work:

~(?:<div class="red">|\G)\s*<a [^>]+>~i
CyberJ

1 Answer

Use DOMDocument in combination with DOMXPath. Here the HTML is in the $html variable:

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$matches = $xpath->query("//div[contains(concat(' ', @class, ' '), ' red ')]/a");
foreach($matches as $a) {
    echo $a->textContent;
}
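
For reference, here is a self-contained sketch of the same approach run against markup like the one in the question. The distinct `href` values, the extra `redish` div, the `normalize-space()` call, and the libxml error suppression are my own additions for illustration and are not part of the original question:

```php
<?php
// Sample markup modelled on the question. The hrefs and the "redish"
// div are hypothetical, added only to make the result easy to check.
$html = <<<HTML
<div class="red lorem-ipsum">
  <a href="#first">Some link</a>
</div>
<div class="red">
  <a href="#second">Some link</a>
</div>
<div class="something red lorem-ipsum">
  <a href="#third">Some link</a>
</div>
<div class="redish">
  <a href="#fourth">Some link</a>
</div>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Pad the class attribute with spaces so that ' red ' only matches a
// whole class token; normalize-space() also tolerates tabs or newlines
// between class names.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' red ')]/a";

$hrefs = [];
foreach ($xpath->query($query) as $a) {
    $hrefs[] = $a->getAttribute('href');
}

print_r($hrefs);
```

This should collect `#first`, `#second` and `#third`, and skip `#fourth`, because `redish` only contains `red` as a substring and not as a separate class name.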
trincot
  • Thank you for the code! I have not tried `DOMXPath` before. Where am I selecting the `<a>`? Is it the `/a` at the end? – CyberJ Feb 22 '19 at 20:28
  • Yes, indeed: `/` means "child" (`//` is "descendant"), and "a" is the tag name. The more complex part is to match a class. More about that part [here](https://stackoverflow.com/questions/1604471/how-can-i-find-an-element-by-css-class-with-xpath) – trincot Feb 22 '19 at 20:33
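
To make the child versus descendant distinction concrete, here is a small sketch. The markup, the `$class` helper string, and the variable names are made up for the illustration:

```php
<?php
// Hypothetical markup: one <a> directly inside the div, one nested in a <p>.
$html = '<div class="red">'
      . '<a href="#direct">Direct</a>'
      . '<p><a href="#nested">Nested</a></p>'
      . '</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Reusable predicate that matches "red" as a whole class token.
$class = "contains(concat(' ', normalize-space(@class), ' '), ' red ')";

// "/a" selects only direct children of the matching div.
$children = $xpath->query("//div[$class]/a");

// "//a" selects every <a> below the matching div, at any depth.
$descendants = $xpath->query("//div[$class]//a");

echo $children->length, "\n";    // 1
echo $descendants->length, "\n"; // 2
```

So if the links may be wrapped in other elements inside the div, `//a` is probably the safer choice.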