I want a regex pattern that removes images whose src attribute is empty, for example:

$html = '<img src="adasas.jpg" /><br />asasas<br />sdfsdf<br /><img title="asa" src="" />';

or

$html = '<img src="adasas.jpg" /><br />asasas<br />sdfsdf<br /><a href="adafgag"><img title="asa" src="" /></a>';

If the empty <img> sits inside an <a> tag, I also want to remove the whole thing (both the <a> and the <img>).

I tested the code below, but it removed all of $html:

echo preg_replace( '!(<a([^>]+)>)?<img(.*?)src=""([^>]+)>(</a>)?!si' , '' , $html );

Can anybody help me?

Thanks in advance.

DJafari
  • It is not recommended to process HTML with regex. See [this](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) question. – Ikke May 26 '11 at 08:40
  • @Ikke: I was about to link to the same question... – Boldewyn May 26 '11 at 08:42
  • That's true, but I want this for my portal, to remove fake images. I don't think an XML parser is needed, because this is the only thing I want to do. Can you help me solve this with a regex? – DJafari May 26 '11 at 08:43
  • @Boldewyn: It is THE question that gets linked when an HTML+RegEx question is posted. LOL :D – Ranhiru Jude Cooray May 26 '11 at 08:44
  • The problem is that it's very hard to get right, and chances are that you'll miss an edge case and your regex will fail. – Ikke May 26 '11 at 09:45

1 Answer

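Your problem is likely that the generic .*? matched too much. With the s flag it can run across tag boundaries, so the match starts at the first <img and keeps going until it reaches the src="" in the last tag, which swallows everything in between. Use [^>]* (or [^>]+) instead, like in the other parts of the pattern, so the match stays inside a single tag: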

'!(<a\s[^>]+>)?<img([^>]+)src=""([^>]*)>(</a>)?!i'
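As a quick check, here is a minimal sketch that runs the pattern above against the two sample strings from the question. The variable names ($pattern, $samples, $cleaned) are only illustrative.

```php
<?php
// Pattern from the answer: [^>] cannot cross a tag boundary,
// so each <img> tag is tested on its own.
$pattern = '!(<a\s[^>]+>)?<img([^>]+)src=""([^>]*)>(</a>)?!i';

// The two sample strings from the question.
$samples = [
    '<img src="adasas.jpg" /><br />asasas<br />sdfsdf<br /><img title="asa" src="" />',
    '<img src="adasas.jpg" /><br />asasas<br />sdfsdf<br /><a href="adafgag"><img title="asa" src="" /></a>',
];

$cleaned = [];
foreach ($samples as $html) {
    // Replace every empty-src image (and a directly wrapping <a>) with nothing.
    $cleaned[] = preg_replace($pattern, '', $html);
}

foreach ($cleaned as $result) {
    echo $result, PHP_EOL;
}
// Both lines print:
// <img src="adasas.jpg" /><br />asasas<br />sdfsdf<br />
```

Note that this only covers markup shaped like the samples. The pattern expects double quotes, it would also hit an attribute such as data-src="", and if the <a> contains anything besides the image, only the closing </a> gets removed along with the <img>.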
mario