
I have PHP code that extracts and displays all the images on a website. How do I modify it so that the dimensions (width and height) of each image are shown as well?

This is the PHP code:

<?php
$page_title = "MiniCrawler";
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    <title><?php print($page_title) ?></title>
</head>
<body>

    <?php 
    ini_set('display_errors', 1);
    error_reporting(E_ALL|E_STRICT);
    // Include simple_html_dom.php.
    include_once ('simple_html_dom.php');

// Add the url of the site you want to scrape. 
    $target_url = "http://www.alibaba.com/";

// Let simple_html_dom do its magic:
    $html = new simple_html_dom();
    $html->load_file($target_url);

// Loop through the page and find every <img> element in the HTML
    foreach($html->find('img') as $link){
        echo $link->src."<br />";
        echo '<img src ="'. $link->src.'"><br />';
    }

    ?>
</body>
</html>

Thanks

Martijn Pieters
Cael

1 Answer


First, check whether the $link->src string is already a complete URL that starts with the scheme and domain name:

<?php

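  if(substr($link->src, 0, 4) == "http"){
    // URL is already complete
    $path = $link->src;
  }else if(substr($link->src, 0, 1) == "/"){
    // path is root-relative; trim the trailing slash of $target_url to avoid "//"
    $path = rtrim($target_url, "/") . $link->src;
  }else{
    // path is document-relative -> http://stackoverflow.com/questions/4444475/transfrom-relative-path-into-absolute-url-using-php
    // simple fallback so $path is always set; only correct when $target_url is the site root
    $path = rtrim($target_url, "/") . "/" . $link->src;
  }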

?>
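The relative-path branch above points to a separate question for the details. As a rough sketch of what such a resolver could look like, the hypothetical helper below (resolve_url is my own name, not part of simple_html_dom or the answer) handles complete, protocol-relative, root-relative and document-relative src values. It assumes PHP 7 or later and ignores data: URIs and any <base> tag on the page.

```php
<?php
// Hypothetical helper (not from the original answer): turn an <img> src
// into an absolute URL, given the URL of the page it was found on.
function resolve_url(string $base, string $src): string
{
    // Already complete: "http://..." or "https://..."
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }

    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $origin = $scheme . '://' . ($parts['host'] ?? '');
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }

    // Protocol-relative: "//cdn.example.com/a.png"
    if (substr($src, 0, 2) === '//') {
        return $scheme . ':' . $src;
    }

    // Root-relative: "/img/a.png"
    if (substr($src, 0, 1) === '/') {
        return $origin . $src;
    }

    // Document-relative: resolve against the directory part of the base path
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1); // keeps the trailing slash

    // Collapse "." and ".." segments
    $segments = [];
    foreach (explode('/', $dir . $src) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;
        }
        if ($segment === '..') {
            array_pop($segments);
            continue;
        }
        $segments[] = $segment;
    }

    return $origin . '/' . implode('/', $segments);
}
```

Inside the foreach loop from the question, this would be called as $path = resolve_url($target_url, $link->src); in place of the if/else chain.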

Then request the file's dimensions with the getimagesize() function:

<?php

  list($width, $height, $type, $attr) = getimagesize($path);
  echo '<img src ="'. $link->src.'" width="' . $width . '" height="' . $height . '"><br />';

?>
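Note that getimagesize() returns false when it cannot read the image (broken link, unsupported format, or remote URLs while allow_url_fopen is disabled), in which case the list() assignment above leaves $width and $height empty. A minimal sketch of a more defensive version follows; image_dimensions and img_tag are hypothetical helper names, not part of any library.

```php
<?php
// Hypothetical helper: return the width and height of an image,
// or null when getimagesize() cannot read it.
function image_dimensions(string $path): ?array
{
    // getimagesize() returns false and emits a warning on failure,
    // so suppress the warning and check the result explicitly.
    $info = @getimagesize($path);
    if ($info === false) {
        return null;
    }
    return ['width' => $info[0], 'height' => $info[1]];
}

// Hypothetical helper: build the <img> tag, adding width and height
// attributes only when the dimensions are known.
function img_tag(string $src, ?array $dim): string
{
    $safe = htmlspecialchars($src, ENT_QUOTES);
    if ($dim === null) {
        return '<img src="' . $safe . '">';
    }
    return sprintf('<img src="%s" width="%d" height="%d">', $safe, $dim['width'], $dim['height']);
}
```

In the loop this could be used as echo img_tag($link->src, image_dimensions($path)) . "<br />";. Keep in mind that each call fetches the image, so a page with many images will be slow to process.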
Patrick G