regexp to match all images without link

Question

I use this regex to match all images. How can I rewrite it to match only images that are NOT followed by a closing </a>?

preg_match_all ("/\<img ([^>]*)\/*\>/i", $text, $dst);

You probably shouldn't be using regex to parse HTML, there are HTML parsers for that in PHP. — Qtax, May 21 '13 at 20:18
You mean you want to find all `img` that do not have `a` as a parent? Do you have some example input HTML? — Explosion Pills, May 21 '13 at 20:19

score 1 · Answer 1 · answered May 21 '13 at 20:55

soap box

I don't recommend using regex to parse an HTML string.

however

However, you might want to try using DOM: first loop through all the images and store them in an array keyed by their src.

$dom = new DOMDocument;
$dom->loadHTML($text);
$array = array();
foreach ($dom->getElementsByTagName('img') as $img) {
    $array[$img->getAttribute('src')] = 1;
}

Then loop through all the links and remove from the array any image found inside one.

foreach ($dom->getElementsByTagName('a') as $a) {
    // inner loop catches multiple images inside one link
    foreach ($a->getElementsByTagName('img') as $img) {
        unset($array[$img->getAttribute('src')]);
    }
}
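
Put together, the two loops above could look like the sketch below. The function name `unlinkedImageSources` and the sample markup are made up for illustration, and the `@` in front of `loadHTML` is only there to silence warnings on sloppy markup.

```php
<?php
// Hypothetical helper: returns the src of every <img> that is not inside an <a>.
function unlinkedImageSources($html)
{
    $dom = new DOMDocument;
    // @ suppresses parser warnings for real-world, not-quite-valid HTML
    @$dom->loadHTML($html);

    // Step 1: collect every image, keyed by its src
    $array = array();
    foreach ($dom->getElementsByTagName('img') as $img) {
        $array[$img->getAttribute('src')] = 1;
    }

    // Step 2: drop every image found anywhere inside a link
    foreach ($dom->getElementsByTagName('a') as $a) {
        foreach ($a->getElementsByTagName('img') as $img) {
            unset($array[$img->getAttribute('src')]);
        }
    }

    return array_keys($array);
}

// Made-up sample input
$html = '<p><img src="a.png"><a href="#"><img src="b.png"></a><img src="c.png"></p>';
print_r(unlinkedImageSources($html));
```

Because the array is keyed by src, an image that appears once inside a link and once outside with the same src is dropped entirely. If that matters, the images would have to be tracked per node rather than per src.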

score 1 · Answer 2 · answered May 21 '13 at 20:56

You could use DOMDocument instead of a regex. The syntax here may not be exactly right, but it should give you an idea.

$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
$images_array = array();
foreach ($images as $image) {
    // keep only images whose direct parent is not a link
    if ($image->parentNode->nodeName != 'a') {
        $images_array[] = $image->getAttribute('src');
    }
}
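
Checking only `parentNode` misses images that are wrapped more deeply, such as `<a><span><img></span></a>`. One way around that, sketched here with `DOMXPath` (which the answer above does not use), is to ask for images that have no `a` ancestor at all. The function name `imagesOutsideLinks` and the sample markup are assumptions for illustration.

```php
<?php
// Hypothetical helper: src of every <img> that has no <a> ancestor at any depth.
function imagesOutsideLinks($html)
{
    $dom = new DOMDocument;
    @$dom->loadHTML($html); // @ hides warnings from imperfect markup

    $xpath = new DOMXPath($dom);
    $images_array = array();
    // ancestor::a matches a link at any level above the image
    foreach ($xpath->query('//img[not(ancestor::a)]') as $image) {
        $images_array[] = $image->getAttribute('src');
    }
    return $images_array;
}

// Made-up sample input: one free image, one nested two levels inside a link
$html = '<div><img src="free.png"><a href="#"><span><img src="nested.png"></span></a></div>';
print_r(imagesOutsideLinks($html));
```

Unlike the src-keyed array, this keeps one entry per matching node, so duplicate src values are preserved.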
