I have the following div:

<div class="myclass"><strong><a rel="nofollow noopener" href="some link">dynamic content</a></strong></div>

and I want to extract only the anchor text, i.e. "dynamic content".

So far I have tried preg_match_all with this pattern:

"'<div class=\"myclass\">(.*?)</div>'si"

That returns the entire content of the div, markup included.

I tried to combine it with:

"|<a.*(?=href=\"([^\"]*)\")[^>]*>([^<]*)</a>|i"

That pattern returns the anchor text on its own, but I cannot get the two to work together.

Can someone help?

Thank you.

qwertyg
  • I think you should read this SO post: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags before you fall down into the "html and regex"-rabbit hole. – M. Eriksson Jun 28 '18 at 09:02

1 Answer

You can use DOMDocument instead of preg_match_all:

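$html = '<div class="myclass"><strong><a rel="nofollow noopener" href="some link">dynamic content</a></strong></div>';

$dom = new DOMDocument();
// @ suppresses the warnings libxml may emit while parsing real-world HTML
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select the <a> inside <strong> inside the div whose class is exactly "myclass"
$query = './/div[@class="myclass"]/strong/a';
$nodes = $xpath->query($query);

// item(0) is null when nothing matched, so check before reading the text
$node = $nodes->item(0);
echo $node !== null ? $node->textContent : '';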
Jagjeet Singh
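
If the div can contain several links, or carries more than one class, the DOMDocument approach above could be wrapped along these lines. This is only a sketch: the function name anchorTextsInDiv is made up for illustration and is not part of the original answer. It assumes the class name contains no double quotes and that the HTML string is non-empty.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text of
// every <a> nested anywhere inside a <div> whose class list contains $class.
function anchorTextsInDiv(string $html, string $class): array
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of silencing with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Token-wise class match, so class="box myclass" also qualifies.
    // Assumption: $class contains no double quote characters.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]//a',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }

    return $texts;
}

$html = '<div class="myclass"><strong><a rel="nofollow noopener" href="some link">dynamic content</a></strong></div>';
print_r(anchorTextsInDiv($html, 'myclass'));
```

The `//a` step matches links at any depth, so the markup between the div and the anchor (here the `<strong>`) can change without breaking the query. An empty array means no matching link was found.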