
I use this regex to match all images. How can I rewrite it to match only the images that are NOT followed by a closing </a>?

preg_match_all ("/\<img ([^>]*)\/*\>/i", $text, $dst);  
Qtax
Dikobraz

2 Answers


soap box

I don't recommend using regex to parse an HTML string.

however

You might want to try using DOM instead: first loop through all the images and store them in an array keyed by their src attribute.

// $dom is a DOMDocument that already has the HTML loaded
foreach ($dom->getElementsByTagName('img') as $img) {
    $array[$img->getAttribute('src')] = 1;
}

Then loop through all links and remove from your array any image found inside one. What remains are the images that are not wrapped in a link.

foreach ($dom->getElementsByTagName('a') as $a) {
    // inner loop catches multiple images inside one link
    foreach ($a->getElementsByTagName('img') as $img) {
        unset($array[$img->getAttribute('src')]);
    }
}
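
Putting the two loops together, a minimal end-to-end sketch might look like the following. The sample $html string and the $unlinked variable are made up for illustration, and the libxml error suppression is only an assumption about how you want to handle imperfect markup.

```php
<?php
// Hypothetical sample input; substitute your own HTML string.
$html = '<p><a href="/x"><img src="linked.png"></a><img src="plain.png"></p>';

$dom = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// Pass 1: record every image, keyed by its src attribute.
$array = [];
foreach ($dom->getElementsByTagName('img') as $img) {
    $array[$img->getAttribute('src')] = 1;
}

// Pass 2: drop every image that sits anywhere inside a link.
foreach ($dom->getElementsByTagName('a') as $a) {
    foreach ($a->getElementsByTagName('img') as $img) {
        unset($array[$img->getAttribute('src')]);
    }
}

// The remaining keys are the src values of images not wrapped in a link.
$unlinked = array_keys($array);
print_r($unlinked);
```

Because the array is keyed by src, an image that appears both inside and outside a link with the same src is dropped entirely. If that matters for your pages, you would need to track the nodes themselves rather than their src values.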
Ro Yo Mi

You could use DOMDocument instead of a regex. The syntax here may not be exactly right, but it should give you the idea.

$dom = new DOMDocument;
$dom->preserveWhiteSpace = false; // set options before loading the document
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
$images_array = array();
foreach ($images as $image) {
  // keep only images whose direct parent is not a link
  if ($image->parentNode->nodeName != 'a')
      $images_array[] = $image->getAttribute('src');
}
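
Note that the parentNode check only looks at the direct parent, so an image nested deeper inside a link (for example inside a span inside the a) would still be collected. One possible variant, sketched here with DOMXPath and an ancestor test, covers that case. The sample $html and the $sources name are illustrative assumptions, not part of the question.

```php
<?php
// Hypothetical sample: one image nested inside a link, one free-standing.
$html = '<div><a href="/x"><span><img src="nested.png"></span></a>'
      . '<img src="free.png"></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select every img element that has no a element among its ancestors.
$sources = [];
foreach ($xpath->query('//img[not(ancestor::a)]') as $img) {
    $sources[] = $img->getAttribute('src');
}

print_r($sources);
```

The XPath expression does the filtering in a single query, so no second loop over the links is needed.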
dave