I have the following code:

<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>
<p style="text-align: right;"><a href="spas.html">Spas</a></p>
<p>My Site Content [...]</p>

I need a regular expression that returns only the "My Site Content [...]" text. In other words, I need to ignore the first image (and possibly other images) as well as any links.

Andre Felipe
  • What have you tried so far? As for parsing HTML with a regex, [see this](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). I love linking this one :P – Tobias Golbs Aug 25 '15 at 13:22
  • To summarize, I've tried: substr(strip_tags($content), 0, 80); The image was excluded, but the link text is still there. – Andre Felipe Aug 25 '15 at 13:25

2 Answers

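Try this:
Use (?<=<p>)([^><]+)(?=</p>) or <p>\K([^><]+)(?=</p>). Both match the text of a <p> that has no attributes and contains no nested tags, so the image and link paragraphs are skipped.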

Update

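// \K drops the already-matched "<p>" so only the paragraph text is captured.
$re = '@<p>\K([^><]+)(?=</p>)@';
$str = "<p>&nbsp;<img src=\"spas01.jpg\" alt=\"\" width=\"630\" height=\"480\"></p>\n<p style=\"text-align: right;\"><a href=\"spas.html\">Spas</a></p>\n<p>My Site Content [...]</p>";

preg_match_all($re, $str, $matches);
print_r($matches[1]); // text of every tag-free <p>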
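
One limitation worth noting: the pattern requires a bare <p>, so a content paragraph that carries attributes would be missed. The following is a minimal sketch of a relaxed variant, not part of the original answer. The function name tagFreeParagraphs is hypothetical, and the sketch still assumes that the wanted paragraphs contain no nested tags at all.

```php
<?php
// Hypothetical helper: returns the text of every <p> (with or without
// attributes) whose body contains no nested tags.
function tagFreeParagraphs(string $html): array
{
    // <p\b[^>]*>  opening tag, optionally with attributes (\b keeps <pre> out)
    // \K          discard the opening tag from the reported match
    // [^<>]+      one or more characters that are not angle brackets
    // (?=</p>)    the closing tag must follow immediately
    preg_match_all('@<p\b[^>]*>\K[^<>]+(?=</p>)@', $html, $matches);
    return $matches[0];
}

$html = '<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>' . "\n"
      . '<p style="text-align: right;"><a href="spas.html">Spas</a></p>' . "\n"
      . '<p>My Site Content [...]</p>';

print_r(tagFreeParagraphs($html));
```

On the sample markup this should still return only the third paragraph, because the first two contain an <img> and an <a> respectively. A paragraph that mixes text with inline tags such as <b> would be skipped entirely, which is where a DOM-based approach tends to be more robust.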

Ahosan Karim Asik

With DOMDocument and DOMXPath:

$html = <<<'EOD'
<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>
<p style="text-align: right;"><a href="spas.html">Spas</a></p>
<p>My Site Content [...]</p>
EOD;

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);
$query = '//p//text()[not(ancestor::a)]';

$textNodes = $xp->query($query);

foreach ($textNodes as $textNode) {
    echo $textNode->nodeValue . PHP_EOL;
}
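
Note that the query above also selects the text node holding the &nbsp; in the first paragraph, so the loop prints a non-breaking space before the real content. As a hedged sketch (the function name paragraphText is hypothetical, and the filtering step is an addition, not part of the answer as written), the same query can be wrapped so that such empty-looking nodes are dropped:

```php
<?php
// Hypothetical helper: collects paragraph text outside of links and skips
// nodes that contain only whitespace or non-breaking spaces.
function paragraphText(string $html): array
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    $result = [];
    foreach ($xp->query('//p//text()[not(ancestor::a)]') as $node) {
        // &nbsp; is decoded to U+00A0, which trim() does not strip by default,
        // so it is converted to a plain space first.
        $text = trim(str_replace("\u{00A0}", ' ', $node->nodeValue));
        if ($text !== '') {
            $result[] = $text;
        }
    }
    return $result;
}

$html = <<<'EOD'
<p>&nbsp;<img src="spas01.jpg" alt="" width="630" height="480"></p>
<p style="text-align: right;"><a href="spas.html">Spas</a></p>
<p>My Site Content [...]</p>
EOD;

print_r(paragraphText($html));
```

Filtering in PHP rather than in the XPath expression is a deliberate choice here: XPath 1.0's normalize-space() only treats space, tab, CR and LF as whitespace, so it would not remove the non-breaking space.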
Casimir et Hippolyte