0

I have an HTML file containing:

 <img width="10" height="12" scr="https://www.site.com/yughggcfgh">
<img width="11" height="15" scr="https://www.site.com/yughggcfghcvbcvb">
<img width="10" height="12" scr="https://www.site.com/a.jpg">
<img width="10" height="12" scr="https://www.site.com/b.gif">

I want to extract into an array the images whose paths don't have an extension.
The output must be as follows:

ari[1]= <img width="10" height="12" scr="https://www.site.com/yughggcfgh">
ari[2]= <img width="11" height="15" scr="https://www.site.com/yughggcfghcvbcvb"> 
Shakti Singh
Alfred Francis
  • possible duplicate of [How to parse and process HTML with PHP?](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-with-php) – outis Apr 04 '12 at 11:58
  • I think you have a typo `scr=` → `src=` – knittl Apr 04 '12 at 12:01

2 Answers

2

You really should use DOMDocument or some other HTML parser rather than a regex. Here's an example:

<?php
$somesource = '<img width="10" height="12" src="https://www.site.com/yughggcfgh">
<img width="11" height="15" src="https://www.site.com/yughggcfghcvbcvb">
<img width="10" height="12" src="https://www.site.com/a.jpg">
<img width="10" height="12" src="https://www.site.com/b.gif">';

$xml = new DOMDocument();
// Suppress libxml warnings about imperfect markup
@$xml->loadHTML($somesource);

$image = array();
foreach ($xml->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src');
    // Keep the URL only if its path has no file extension
    // (this also handles extensions that are not three letters, e.g. .jpeg)
    $path = (string) parse_url($src, PHP_URL_PATH);
    if (pathinfo($path, PATHINFO_EXTENSION) === '') {
        $image[] = $src;
    }
}

print_r($image);

/* Output:
Array
(
    [0] => https://www.site.com/yughggcfgh
    [1] => https://www.site.com/yughggcfghcvbcvb
)
*/
?>
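The question asks for the complete <img> tags rather than just the URLs. Here is a minimal sketch of one way to collect them, assuming PHP 5.3.6 or later (where DOMDocument::saveHTML() accepts a node argument). The function name imgTagsWithoutExtension and the sample markup are made up for illustration:

```php
<?php
// Hypothetical helper: returns the markup of every <img> whose src path
// has no file extension.
function imgTagsWithoutExtension($html)
{
    $doc = new DOMDocument();
    // Silence libxml warnings about imperfect markup
    @$doc->loadHTML($html);

    $tags = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $path = (string) parse_url($img->getAttribute('src'), PHP_URL_PATH);
        if (pathinfo($path, PATHINFO_EXTENSION) === '') {
            // saveHTML($node) serializes just this element
            $tags[] = $doc->saveHTML($img);
        }
    }
    return $tags;
}

// Assumed sample input: one image without an extension, one with
$html = '<img width="10" height="12" src="https://www.site.com/yughggcfgh">
<img width="10" height="12" src="https://www.site.com/a.jpg">';

$tags = imgTagsWithoutExtension($html);
print_r($tags);
```

The serialized tag may differ slightly from the input (attribute quoting or whitespace can be normalized), so compare on the src value rather than on the raw string if exact matching matters.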
Lawrence Cherone
1

Regular expressions are probably not the right tool for the job, but here you go …

You should be able to achieve your goal with negative lookbehind assertions:

preg_match_all('/src="[^"]+(?<!\.jpg|\.jpeg|\.gif|\.png)"/i', $html, $matches);
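As a rough illustration of how the lookbehind behaves, here is a small self-contained sketch. The sample markup and variable names are assumptions. It uses [^"]+ so the match cannot run past the closing quote, plus a capture group so that $matches[1] holds only the URLs. The pattern must be wrapped in delimiters (the slashes here); a "Delimiter must not be alphanumeric or backslash" warning usually means they are missing.

```php
<?php
// Sample input (assumed); note the attribute is spelled src, not scr
$html = '<img width="10" height="12" src="https://www.site.com/yughggcfgh">
<img width="10" height="12" src="https://www.site.com/a.jpg">
<img width="10" height="12" src="https://www.site.com/b.gif">';

// [^"]+ keeps the match inside one attribute value; the negative lookbehind
// rejects values that end in one of the listed extensions.
$pattern = '/src="([^"]+)(?<!\.jpg|\.jpeg|\.gif|\.png)"/i';

preg_match_all($pattern, $html, $matches);

// $matches[0] holds the full src="..." matches, $matches[1] the captured URLs
print_r($matches[1]);
```

The extension list is a blacklist, so any extension not named in it (for example .webp) would still be matched; the parser-based approach avoids that limitation.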
knittl
  • error : Warning: preg_match() [function.preg-match]: Delimiter must not be alphanumeric or backslash in C:\xampp\htdocs\curl\index.php on line 19 – Alfred Francis Apr 04 '12 at 12:04