2

If I have a string:

<div> balah balah <img src='image/www.png' /> balah balah</div>
<div> balah balah <img src='image/ttt.png' /> balah balah</div>
<div> balah balah <img src='image/rrr.png' /> balah balah</div>

How can I find the image name that is in the src attribute? I use this code

 $pos = strpos($str,".png");

to find the position of ".png", and I do get the position.

I found the first ".png", but I could not find a way to traverse backwards from ".png" to the preceding "/".

How can I get the name between "/" and ".", which here is "www"?

I am a little confused.

Updated question (the actual problem):

Suppose I fetch the HTML of a URL in PHP with the help of cURL.

How can I retrieve all the image names and store the images in a folder?

Naresh

4 Answers

6

You can use something like this to get the src of each image:

<?php
    // $htmlString holds the HTML to search
    $doc = new DOMDocument();
    $doc->loadHTML($htmlString);
    $imageTags = $doc->getElementsByTagName('img');

    // Print the src attribute of every <img> tag
    foreach($imageTags as $tag) {
        echo $tag->getAttribute('src');
    }
?>
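The loop above prints the full src value (for example image/www.png), while the question asks for the bare name www. A minimal sketch of that last step, assuming the markup from the question and using pathinfo() to drop the directory and extension. The function name image_names_from_html is made up for illustration:

```php
<?php
// Hypothetical helper: returns the file names (without directory or
// extension) of every <img> src found in an HTML string.
function image_names_from_html(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings instead of printing them for imperfect HTML.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $names = [];
    foreach ($doc->getElementsByTagName('img') as $tag) {
        $src = $tag->getAttribute('src');
        if ($src === '') {
            continue; // skip <img> tags without a usable src
        }
        // Drop any query string, then keep only the file name part.
        $path = (string) parse_url($src, PHP_URL_PATH);
        $names[] = pathinfo($path, PATHINFO_FILENAME);
    }
    return $names;
}

$html = "<div> balah balah <img src='image/www.png' /> balah balah</div>"
      . "<div> balah balah <img src='image/ttt.png' /> balah balah</div>"
      . "<div> balah balah <img src='image/rrr.png' /> balah balah</div>";

print_r(image_names_from_html($html));
```

For the three sample lines this should give www, ttt and rrr. Use PATHINFO_BASENAME instead if the extension should be kept.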
Luis Tellez
1

You should use preg_match_all for such a task. Not tested:

preg_match_all('/image\/(.*)\.png/iU', $str, $matches);

var_dump($matches);

$matches[1] should now contain www, ttt, rrr.

  • 1
    Never ever try to parse HTML with regular expressions. HTML is not a regular language and can't be parsed with regexes. Use a proper HTML/XML parser instead. For details, please refer to http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Holger Just Feb 22 '13 at 09:50
1
$text = "
<div> balah balah <img src='image/www.png' /> balah balah</div>
<div> balah balah <img src='image/ttt.png' /> balah balah</div>
<div> balah balah <img src='image/rrr.png' /> balah balah</div>
";
preg_match_all("/src='image\/([^.]+)/i", $text, $out);
/*
echo $out[1][0]; //www
echo $out[1][1]; //ttt
echo $out[1][2]; //rrr
*/
print_r($out);

OUTPUT
Array
(
    [0] => Array
        (
            [0] => src='image/www
            [1] => src='image/ttt
            [2] => src='image/rrr
        )

    [1] => Array
        (
            [0] => www
            [1] => ttt
            [2] => rrr
        )

)
0

I wrote a script with the help of all of you. I hope it helps others who have the same problem.

  <?php
        $url='http://php.net/'; 
        $returned_content = get_url_contents($url); 

        /* gets the data from a URL */

        function get_url_contents($url){
                $crl = curl_init();
                $timeout = 5;
                curl_setopt ($crl, CURLOPT_URL,$url);
                curl_setopt ($crl, CURLOPT_RETURNTRANSFER, 1);
                curl_setopt ($crl, CURLOPT_CONNECTTIMEOUT, $timeout);
                $ret = curl_exec($crl);
                curl_close($crl);
                return $ret;
        }

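        $doc = new DOMDocument();
        // Real-world HTML is rarely valid; collect parser warnings instead of printing them
        libxml_use_internal_errors(true);
        $doc->loadHTML($returned_content);
        libxml_clear_errors();
        $imageTags = $doc->getElementsByTagName('img');
        $img1 = array();
        // Note: src values may be relative URLs, but cURL needs absolute ones
        foreach($imageTags as $tag) {
            $img1[] = $tag->getAttribute('src');
        }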

        foreach($img1 as $i){
            save_image($i);
            if(getimagesize(basename($i))){
                echo '<h3 style="color: green;">Image ' . basename($i) . ' Downloaded OK</h3>';
            }else{
                echo '<h3 style="color: red;">Image ' . basename($i) . ' Download Failed</h3>';
            }
        }

        // Alternative image saving using cURL, since allow_url_fopen is disabled
        function save_image($img1,$fullpath=null){
            // Default to the image's file name in the current directory
            if($fullpath===null){
                $fullpath = basename($img1);
            }
            $ch = curl_init ($img1);
            curl_setopt($ch, CURLOPT_HEADER, 0);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            $rawdata=curl_exec($ch);
            curl_close ($ch);
            // Remove any stale copy so the 'x' (create-only) mode below succeeds
            if(file_exists($fullpath)){
                unlink($fullpath);
            }
            $fp = fopen($fullpath,'x');
            // curl_exec() returns false on failure; write an empty file in that case
            fwrite($fp, $rawdata === false ? '' : $rawdata);
            fclose($fp);
        }
    ?>
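One caveat with the script above: src attributes are often relative (the question's own example uses image/www.png), and curl_init() in save_image() needs an absolute URL. A rough sketch of resolving a src against the page URL, covering only the common cases. resolve_url is a hypothetical helper, not part of PHP, and it does not normalise ../ segments, honour a <base> tag, or handle data: URIs:

```php
<?php
// Hypothetical helper: turn an <img> src into an absolute URL, given the
// URL of the page it came from. Handles common cases only.
function resolve_url(string $src, string $pageUrl): string
{
    // Already absolute (http://, https://, ...): use as-is.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $src)) {
        return $src;
    }
    $base = parse_url($pageUrl);
    $scheme = $base['scheme'] ?? 'http';
    // Protocol-relative: //cdn.example.com/a.png
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }
    $origin = $scheme . '://' . ($base['host'] ?? '');
    if (isset($base['port'])) {
        $origin .= ':' . $base['port'];
    }
    // Root-relative: /images/a.png
    if (strpos($src, '/') === 0) {
        return $origin . $src;
    }
    // Document-relative: resolve against the directory of the page path.
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

echo resolve_url('image/www.png', 'http://php.net/manual/en/index.php'), "\n";
echo resolve_url('/images/logo.png', 'http://php.net/manual/en/index.php'), "\n";
```

In the script this would be applied as save_image(resolve_url($i, $url)) inside the download loop, so that relative paths are fetched from the right host.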
Naresh