How can I match all images whose src starts with pics.domain.com?

What I've tried:

preg_match_all('/<img .*src=(pics.domain.com*)["|\']([^"|\']+)/i', $row['story'], $matches);
rabotalius
  • For the zillionth time, please use an HTML parser to work on HTML. Use DOMDocument to pick out the `img` tags, and then use a regex to check the `src` attribute. – nhahtdh Apr 10 '13 at 02:40
  • @nhahtdh - it does look like it's coming from a DB, but still a good point. – adeneo Apr 10 '13 at 02:42

2 Answers

Use DOMDocument to iterate over each <img> tag, then use parse_url() to extract the host from each image's src attribute:

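$doc = new DOMDocument;
// Collect parser warnings internally instead of emitting them for sloppy markup.
libxml_use_internal_errors(true);
$doc->loadHTML($row['story']);
libxml_clear_errors();

foreach ($doc->getElementsByTagName('img') as $img) {
    // parse_url() returns null for the host when src is a relative path.
    if (parse_url($img->getAttribute('src'), PHP_URL_HOST) === 'pics.domain.com') {
        echo "Yay, image found\n";
    }
}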
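
If you need the matching URLs rather than a message, the same approach can be wrapped in a function. This is only a minimal sketch: the name findImagesByHost() and the sample HTML are made up for illustration and do not come from the question, and it assumes the dom extension is available.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the src of
// every <img> in $html whose URL host equals $host, compared case-insensitively.
function findImagesByHost(string $html, string $host): array
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument;
    // Suppress warnings for malformed markup, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // parse_url() yields null (no host) or false (unparseable URL) here,
        // so only compare when a host string was actually extracted.
        $srcHost = parse_url($src, PHP_URL_HOST);
        if (is_string($srcHost) && strcasecmp($srcHost, $host) === 0) {
            $found[] = $src;
        }
    }
    return $found;
}

// Sample input, invented for this example.
$html = '<p><img src="http://pics.domain.com/a.jpg">'
      . '<img alt="x" src="http://other.example.com/b.png">'
      . '<img src="/local/c.gif"></p>';

print_r(findImagesByHost($html, 'pics.domain.com'));
```

Comparing the parsed host, instead of checking whether the src string starts with pics.domain.com, means the scheme (http or https) does not matter, and relative paths are skipped because they have no host.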
Ja͢ck
  • +1, I was actually typing the same thing based on this [answer](http://stackoverflow.com/questions/2120779/regex-php-isolate-src-attribute-from-img-tag), but you beat me to it! – adeneo Apr 10 '13 at 02:50
I've used this regex in the past; it also works outside of <img> tags.

'@[\'"](https?://)?([^\.][^\'"]*?)(/)?([^\'"/]*?)\.(jpg|jpeg|png|gif|bmp)[\'"]@'

A more specific version:

'@[\'"](https?://)?pics\.domain\.com[^\'"]*?\.(jpg|jpeg|png|gif|bmp)[\'"]@'

In English:

[start quote](http or https or neither)pics.domain.com(anything that isn't a quote)(some image extension)[end quote]
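
As a rough sketch of how the more specific pattern could be applied with preg_match_all(): the function name extractPicsUrls() and the sample text are made up for illustration. Each full match includes the surrounding quotes, so the sketch trims them off.

```php
<?php
// Hypothetical helper: returns every quoted pics.domain.com image URL found
// in $text, using the "more specific" pattern from this answer unchanged.
function extractPicsUrls(string $text): array
{
    $pattern = '@[\'"](https?://)?pics\.domain\.com[^\'"]*?\.(jpg|jpeg|png|gif|bmp)[\'"]@';

    preg_match_all($pattern, $text, $matches);

    // $matches[0] holds the full matches, quotes included; strip the quotes.
    return array_map(
        function (string $quoted): string {
            return trim($quoted, '\'"');
        },
        $matches[0]
    );
}

// Sample input, invented for this example.
$text = '<img src="http://pics.domain.com/cats/one.jpg"> '
      . "<a href='pics.domain.com/two.png'>link</a> "
      . '<img src="http://other.example.com/three.gif">';

print_r(extractPicsUrls($text));
```

Note that the pattern requires the extension to sit directly before the closing quote, so a URL with a query string such as one.jpg?w=100 is not matched, and without the i modifier an uppercase extension like .JPG is not matched either.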

Michael Benjamin