
How can I extract images from a website with simple-html-dom and save them to local files, so that they can be loaded from disk instead of being fetched from the original site every time?

include 'simple_html_dom.php';

$html = file_get_html('http://www.caradisiac.com/');
foreach ($html->find('.featured img') as $image) {
    echo $image->src;
    echo "<br>";
}

Help me please!!

  • Please post the code that you have put together so far -- Stack Overflow is for solving specific programming problems; it is not a coding service. Have you searched for similar problems? – i alarmed alien Sep 27 '14 at 12:07

1 Answer


0 - Read the PHP manual first: it documents built-in functions for each of the steps below.

1 - Build a local path for the image; you can use preg_replace to sanitize the URL.

2 - Check whether the image has already been downloaded using file_exists: if so, load the local copy; otherwise download it.

3 - Use file_get_contents to retrieve the image (cURL would be needlessly heavy here).

4 - Save it to a local file using file_put_contents.

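// $sourceURI is the URI of the page being scraped.
$sourceURI = 'http://www.caradisiac.com/';
// Optional stream context; the 10-second timeout is only an example value.
$streamContext = stream_context_create(['http' => ['timeout' => 10]]);

foreach ($html->find('.featured img') as $image)
{
    $imageSrc = $image->src;
    // rel2abs() is a helper you must supply (see the notes below).
    $imageUri = rel2abs($imageSrc, $sourceURI);
    $imageLocalPath = 'getImages/'.preg_replace('/[^a-z0-9.-]/i', '-', $imageUri);

    if (!file_exists($imageLocalPath))
    {
        // Not cached yet: download, and store only if the download succeeded.
        $imageData = file_get_contents($imageUri, false, $streamContext);
        if ($imageData !== false)
            file_put_contents($imageLocalPath, $imageData);
    }
    else
    {
        // Already cached: read the local copy.
        $imageData = file_get_contents($imageLocalPath);
    }
}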
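The loop above relies on a rel2abs helper that is not shown. As a rough idea of what it has to do, here is a minimal sketch. It is an assumption about the helper's behaviour, not the answerer's actual implementation, and it is not a full RFC 3986 resolver (query-only and fragment-only references are not handled).

```php
<?php
// Hypothetical minimal resolver: turns a possibly relative image URL into an
// absolute one, using the page URI as the base.
function rel2abs(string $rel, string $base): string
{
    // Already absolute: the reference carries its own scheme.
    if (is_string(parse_url($rel, PHP_URL_SCHEME))) {
        return $rel;
    }

    $b = parse_url($base);
    $scheme = $b['scheme'] ?? 'http';

    // Protocol-relative reference such as //cdn.example.com/a.png
    if (strpos($rel, '//') === 0) {
        return $scheme . ':' . $rel;
    }

    $root = $scheme . '://' . ($b['host'] ?? '')
          . (isset($b['port']) ? ':' . $b['port'] : '');

    if (strpos($rel, '/') === 0) {
        // Root-relative: replaces the whole base path.
        $path = $rel;
    } else {
        // Path-relative: append to the directory part of the base path.
        $basePath = $b['path'] ?? '/';
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $rel;
    }

    // Collapse "." and ".." segments.
    $out = [];
    foreach (explode('/', ltrim($path, '/')) as $segment) {
        if ($segment === '..') {
            array_pop($out);
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }

    return $root . '/' . implode('/', $out);
}
```

For example, rel2abs('img/a.png', 'http://www.caradisiac.com/') yields http://www.caradisiac.com/img/a.png, while a src that is already absolute is returned unchanged.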

Notes:

  • You need a rel2abs function to resolve relative URIs, or an appropriate PECL extension.
  • getImages/ puts all images in a subfolder: either create that subfolder manually, or check for it in the PHP code and create it if needed.
  • $imageData contains the raw image data; you may wish to use imagecreatefromstring to load the corresponding GD image.
  • Take care: you are downloading content from a remote web page, so you must trust it. Someone could add a tag like <div class="featured"><img src="http://evil.com/your-heart-will-bleed.php"/></div> to the HTML page, and the evil PHP file would be downloaded. Worse, if your server executes PHP files in that folder, it could be run by visiting http://mywebsite.com/getImages/your-heart-will-bleed.php.
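The folder and trust notes above can be sketched as three small helpers. All three names (cachePathFor, looksLikeImage, ensureDir) are hypothetical, and the .img extension and the JPEG/PNG/GIF whitelist are assumptions made for illustration. The idea is to never reuse the remote file name or extension, and to store only data that actually parses as an image.

```php
<?php
// Hypothetical helper: builds a cache path from a hash of the URL, so a
// ".php" URL can never produce a ".php" file in the cache folder.
function cachePathFor(string $imageUri, string $dir = 'getImages'): string
{
    return $dir . '/' . sha1($imageUri) . '.img';
}

// Hypothetical helper: true only if the bytes parse as a JPEG, PNG or GIF.
// It inspects the data itself rather than trusting the URL.
function looksLikeImage(string $data): bool
{
    $info = @getimagesizefromstring($data);
    return $info !== false
        && in_array($info[2], [IMAGETYPE_JPEG, IMAGETYPE_PNG, IMAGETYPE_GIF], true);
}

// Hypothetical helper: creates the cache folder on first use.
function ensureDir(string $dir): bool
{
    return is_dir($dir) || mkdir($dir, 0755, true) || is_dir($dir);
}
```

In the download loop, one would call ensureDir('getImages') once, use cachePathFor($imageUri) in place of the preg_replace path, and call file_put_contents only when looksLikeImage($imageData) returns true. Configuring the web server so that it does not execute scripts in the cache folder is still worth doing as a second line of defence.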
Xenos
  • Where do you get $sourceURI from? – Alejandro Oct 06 '16 at 03:05
  • `$sourceURI` is the web page URI you are calling (here, `http://www.caradisiac.com/`). It is required because `$image->src` can be relative (while here, you need an absolute URI for calling `file_get_contents`). – Xenos Oct 06 '16 at 07:33