
I have the following string repeated many times (about 10 times):

<br />
<a href="http://www.someurl.com/someimage.jpg" target="_blank">SOME TEXT</a>

Now I want to match that piece of code and basically strip it completely out of my string. The catch is that the image URL and the 'SOME TEXT' will always be different, and I need to do this only for the first 3 instances of this combination (including the line break, `<br />`) in the string.

nickb
Mark
  • Use [dom document](http://php.net/manual/en/class.domdocument.php) – Esailija Jun 24 '12 at 13:39
  • [check this out](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – Adi Jun 24 '12 at 13:40
  • Or use [phpquery](http://code.google.com/p/phpquery/) (jQuery port to PHP) – German Rumm Jun 24 '12 at 13:49
  • [Regexes can't reliably parse html](http://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg/702222#702222), you need a parser for that. A mandatory read should be [bobince's answer](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). – Lieven Keersmaekers Jun 24 '12 at 14:13
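
As a rough illustration of the parser-based route suggested in the comments above, here is a minimal sketch using PHP's built-in `DOMDocument` and `DOMXPath`. The function name `remove_first_link_pairs`, the wrapper `div` with id `wrap`, and the sample input are assumptions made for illustration and do not come from the question. The sketch also assumes plain ASCII input (no charset handling) and treats any `<br>` whose next element sibling is an `<a>` as a match.

```php
<?php
// Hypothetical helper (not from the original post): removes the first $limit
// occurrences of a <br> whose next element sibling is an <a>, plus that <a>.
function remove_first_link_pairs(string $html, int $limit = 3): string
{
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc = new DOMDocument();
    // Wrap the fragment so its contents can be located again after parsing.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Every <br> inside the wrapper whose first following element sibling is an <a>.
    $brs = $xpath->query('//div[@id="wrap"]//br[following-sibling::*[1][self::a]]');

    $removed = 0;
    foreach ($brs as $br) {
        if ($removed >= $limit) {
            break;
        }
        $link = $xpath->query('following-sibling::*[1]', $br)->item(0);
        $link->parentNode->removeChild($link);
        $br->parentNode->removeChild($br);
        $removed++;
    }

    // Serialize only the wrapper's children, not the implied <html>/<body>.
    $wrapper = $xpath->query('//div[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Made-up sample input: five <br /> + link pairs with differing URLs and text.
$html = 'Intro';
for ($i = 1; $i <= 5; $i++) {
    $html .= "<br />\n<a href=\"http://www.someurl.com/image{$i}.jpg\" target=\"_blank\">TEXT {$i}</a>";
}
echo remove_first_link_pairs($html), "\n";
```

Note that `saveHTML` writes the remaining line breaks as `<br>` rather than `<br />`, and that whitespace between a removed `<br>` and its link is left in place.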

1 Answer


It is a bad idea to parse HTML with regex, but if you want to do it anyway, then use:

PHP code:

$str = preg_replace('/<br\s*\/>\s*<a href="[^"]*" target="_blank">[^<]*<\/a>/', '', $str, 3);
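
To see the limit argument in action, here is a small self-contained sketch. The function name `strip_first_links` and the sample input are made up for illustration; the pattern is the one from the line above. The fourth argument of `preg_replace` caps the number of replacements, and the optional fifth argument receives the number actually performed.

```php
<?php
// Hypothetical wrapper around the pattern from the answer.
function strip_first_links(string $str, int $limit = 3, ?int &$count = null): string
{
    $pattern = '/<br\s*\/>\s*<a href="[^"]*" target="_blank">[^<]*<\/a>/';
    // 4th argument: maximum number of replacements; 5th: how many were made.
    return preg_replace($pattern, '', $str, $limit, $count);
}

// Made-up sample input: five <br /> + link pairs with differing URLs and text.
$str = 'Intro';
for ($i = 1; $i <= 5; $i++) {
    $str .= "<br />\n<a href=\"http://www.someurl.com/image{$i}.jpg\" target=\"_blank\">TEXT {$i}</a>";
}

$result = strip_first_links($str, 3, $count);
echo $count, "\n";  // number of replacements performed
echo $result, "\n"; // only the 4th and 5th pairs remain after "Intro"
```

The pattern depends on the exact attribute order and quoting shown in the question, so markup that deviates from it (extra attributes, single quotes, nested tags inside the link) will not match.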
Ωmega
  • Here's the PHP equivalent of your line of Perl: `$html=preg_replace('@
    \s*[^<]*@','',$html,3);`. This would replace the first 3 instances of the pattern you posted (which should work given the OP's explanation).
    – Sean Johnson Jun 24 '12 at 14:08