60

Using PHP, given a URL, how can I determine whether it is an image?

There is no context for the URL - it is just in the middle of a plain text file, or maybe just a string on its own.

I don't want high overhead (e.g. reading the content of the URL) as this could be called for many URLs on a page. Given this restriction, it isn't essential that all images are identified, but I would like a fairly good guess.

At the moment I am just looking at the file extension, but it feels like there should be a better way than this.

Here is what I currently have:

  function isImage( $url )
  {
    $pos = strrpos( $url, ".");
    if ($pos === false)
      return false;
    $ext = strtolower(trim(substr( $url, $pos)));
    $imgExts = array(".gif", ".jpg", ".jpeg", ".png", ".tiff", ".tif"); // this is far from complete but that's always going to be the case...
    if ( in_array($ext, $imgExts) )
      return true;
    return false;
  }

Edit: In case it's useful to anybody else here is the final function using the technique from Emil H's answer:

  function isImage($url)
  {
    // A HEAD request asks for the headers only, so the body is not downloaded
    $params = array('http' => array(
      'method' => 'HEAD'
    ));
    $ctx = stream_context_create($params);
    $fp = @fopen($url, 'rb', false, $ctx);
    if (!$fp)
      return false;  // Problem with url

    $meta = stream_get_meta_data($fp);
    fclose($fp);

    if (empty($meta["wrapper_data"]) || !is_array($meta["wrapper_data"]))
      return false;  // No response headers available

    foreach ($meta["wrapper_data"] as $header) {
      // Header names are case-insensitive, so compare with stripos()
      if (stripos($header, "Content-Type: image/") === 0)
        return true;
    }

    return false;
  }
danio

8 Answers

29

You could use an HTTP HEAD request and check the Content-Type header. This might be a good compromise. It can be done using PHP streams. Wez Furlong has an article that shows how to use this approach to send POST requests, but it can easily be adapted to send HEAD requests instead. You can retrieve the headers from an HTTP response using stream_get_meta_data().

Of course, this isn't 100% reliable. Some servers send incorrect headers. It will, however, handle cases where images are delivered through a script and the correct file extension isn't available. The only way to be really certain is to actually retrieve the image, either all of it or just the first few bytes, as suggested by thomasrutter.
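
As a rough sketch of the header check, the raw header lines (for example `$meta['wrapper_data']` from stream_get_meta_data(), or the list returned by get_headers($url) without its second argument) can be reduced to the Content-Type of the final response. The helper names `finalContentType` and `headersIndicateImage` are hypothetical, and the sketch assumes each redirect hop starts with its own `HTTP/` status line:

```php
<?php
// Hypothetical helper: returns the media type of the LAST response found in
// a list of raw header lines, or null if that response sent no Content-Type.
function finalContentType(array $headerLines): ?string
{
    $type = null;
    foreach ($headerLines as $line) {
        if (strncasecmp($line, 'HTTP/', 5) === 0) {
            // A status line starts a new response (e.g. after a redirect),
            // so forget whatever the previous hop reported.
            $type = null;
        } elseif (strncasecmp($line, 'Content-Type:', 13) === 0) {
            // Keep only the media type; drop parameters such as charset.
            $value = substr($line, 13);
            $type = strtolower(trim(explode(';', $value)[0]));
        }
    }
    return $type;
}

// Hypothetical helper: true when the final response claims to be an image.
function headersIndicateImage(array $headerLines): bool
{
    $type = finalContentType($headerLines);
    return $type !== null && strncmp($type, 'image/', 6) === 0;
}
```

Only the last response is considered, so a redirect page served as text/html does not hide an image behind it, and an image type on an intermediate hop does not produce a false positive.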

Emil H
  • I would not describe this as bullet-proof. Browsers ignore the content-type for images encountered in – thomasrutter Mar 24 '09 at 11:27
  • Yes. I agree. I'll change the language. :) I do think that it's the best one can do without retrieving the actual content of the url, though. – Emil H Mar 24 '09 at 11:29
  • (downvote removed) yeah I think this is a decent option now :) – thomasrutter Mar 24 '09 at 11:36
  • 3
    I like the idea of checking the content type, but I will also consider your point that the content-type may not always be accurate. `$headers = get_headers($url, 1); $type = $headers["Content-Type"];` – SSH This May 24 '13 at 15:44
16
if(is_array(getimagesize($urlImg)))
    echo 'Yes it is an image!';
Liam Hammett
Pedro Soares
14

In addition to Emil H's answer:

Use get_headers() to check the content type of a URL, rather than downloading the file with getimagesize():

    $url_headers=get_headers($url, 1);

    if($url_headers!==false && isset($url_headers['Content-Type'])){

        $type=$url_headers['Content-Type'];

        // After redirects this is an array; the last entry belongs to the final response
        if(is_array($type)){
            $type=end($type);
        }

        // Drop parameters such as "; charset=utf-8" before comparing
        $type=strtolower(trim(explode(';', $type)[0]));

        $valid_image_type=array();
        $valid_image_type['image/png']='';
        $valid_image_type['image/jpg']='';
        $valid_image_type['image/jpeg']='';
        $valid_image_type['image/jpe']='';
        $valid_image_type['image/gif']='';
        $valid_image_type['image/tif']='';
        $valid_image_type['image/tiff']='';
        $valid_image_type['image/svg']='';
        $valid_image_type['image/svg+xml']='';
        $valid_image_type['image/ico']='';
        $valid_image_type['image/icon']='';
        $valid_image_type['image/x-icon']='';

        if(isset($valid_image_type[$type])){

            //do something

        }
    }
RafaSashi
  • 1
    `$url_headers['Content-Type']` could very well be an array, and should be checked before assigning `$type`. Also, take care to keep track of the order you read the array (if only interested in final content-type, after all redirects, then you'll want the last item in the array) – Birrel Aug 24 '16 at 04:45
14

There are a few different approaches.

  • Sniff the content by looking for a magic number at the start of the file. For example, GIF uses GIF87 or GIF89 as the first five bytes of the file (in ascii). Unfortunately this can't tell you if there's an error in the image or if the image contains malicious content. Here are some magic numbers for various types of image files (feel free to use these):

    "\xff\xd8\xff" => 'image/jpeg',
    "\x89PNG\x0d\x0a\x1a\x0a" => 'image/png',
    "II*\x00" => 'image/tiff',
    "MM\x00*" => 'image/tiff',
    "\x00\x00\x01\x00" => 'image/ico',
    "\x00\x00\x02\x00" => 'image/ico',
    "GIF89a" => 'image/gif',
    "GIF87a" => 'image/gif',
    "BM" => 'image/bmp',
    

    Sniffing the content like this is probably going to fit your requirements best; you'll only have to read and therefore download the first few bytes of the file (past the header).

  • Load the image using the GD library to see if it loads without error. This can tell you whether the image is valid and free of errors. Unfortunately this probably doesn't fit your requirements because it requires downloading the complete image.

  • If you really don't want to make an HTTP request for the image at all, then this rules out both sniffing and getting HTTP headers. You can, however, try to determine whether something is an image by the context in which it is linked. Something linked using a src attribute in an <img> element is almost certainly an image (or an attempt at XSS, but that's another story). This will tell you if something is intended as an image. It won't tell you whether the image is actually available, or valid; you'll have to fetch at least the first small part (header or magic number) of the image URL to find that.

Unfortunately, it is possible for a file to be both a valid image and a ZIP file containing harmful content which could be executed as Java by a harmful site - see the GIFAR exploit. You can almost certainly prevent this vulnerability by loading the image in a library like GD and performing some non-trivial filter on it, like softening or sharpening it a tiny amount (i.e. using a convolution filter) and saving it to a fresh file without transferring any metadata across.

Trying to determine if something is an image by its content-type alone is quite unreliable, almost as unreliable as checking the file extension. When loading an image using an <img> element, browsers sniff for a magic string.
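
As a rough sketch of the sniffing approach, the signatures listed above can be matched against the first few bytes of a response. The function names `sniffImageType` and `fetchLeadingBytes`, the 16-byte read length and the 5-second timeout are assumptions made for illustration. A server is free to ignore the Range header and send the whole body, in which case only the read limit bounds how much is kept:

```php
<?php
// Leading byte signatures mapped to MIME types (taken from the list above).
const IMAGE_SIGNATURES = [
    "\xff\xd8\xff"            => 'image/jpeg',
    "\x89PNG\x0d\x0a\x1a\x0a" => 'image/png',
    "II*\x00"                 => 'image/tiff',
    "MM\x00*"                 => 'image/tiff',
    "\x00\x00\x01\x00"        => 'image/ico',
    "\x00\x00\x02\x00"        => 'image/ico',
    "GIF89a"                  => 'image/gif',
    "GIF87a"                  => 'image/gif',
    "BM"                      => 'image/bmp',
];

// Hypothetical helper: returns the MIME type whose signature matches the
// start of $bytes, or null when nothing matches.
function sniffImageType(string $bytes): ?string
{
    foreach (IMAGE_SIGNATURES as $magic => $mime) {
        // strncmp() is binary-safe and only compares the signature's length.
        if (strncmp($bytes, $magic, strlen($magic)) === 0) {
            return $mime;
        }
    }
    return null;
}

// Hypothetical helper: reads at most $length bytes from the start of $url.
// Returns null when the request fails.
function fetchLeadingBytes(string $url, int $length = 16): ?string
{
    $ctx = stream_context_create(['http' => [
        'method'  => 'GET',
        // Ask the server for the first bytes only (it may not honour this).
        'header'  => 'Range: bytes=0-' . ($length - 1) . "\r\n",
        'timeout' => 5,
    ]]);
    $data = @file_get_contents($url, false, $ctx, 0, $length);
    return $data === false ? null : $data;
}
```

A caller would combine the two, for example `$bytes = fetchLeadingBytes($url); $type = $bytes === null ? null : sniffImageType($bytes);`. A two-byte signature such as "BM" is weak evidence on its own, so a match is still only a good guess.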

thomasrutter
  • Thanks for the detailed answer but for my application reading potentially hundreds of images before displaying a page will probably be too much overhead. – danio Mar 24 '09 at 11:55
  • Did you read my third dot point? Not sure if you're getting images from – thomasrutter Mar 24 '09 at 12:13
  • Ah - just saw you're getting the image URLs from a text file. Some sort of light HTTP request, like my first dot point, would probably be necessary. Note that it's not significantly more overhead than the HTTP HEAD method. – thomasrutter Mar 24 '09 at 12:15
  • Take your point that there won't be much more overhead for reading the beginning of the image, but it would involve more coding, and content-type should be good enough. Much better than my extension checking anyway! – danio Mar 24 '09 at 12:25
8

Edit: For static images with a popular image extension:

<?php
$imgExts = array("gif", "jpg", "jpeg", "png", "tiff", "tif");
$url ='path/to/image.png';
$urlExt = pathinfo($url, PATHINFO_EXTENSION);
if (in_array($urlExt, $imgExts)) {
    echo 'Yes, '.$url.' is an Image';
}

?>
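
A variation on the snippet above, sketched under the assumption that URLs may carry a query string, a fragment or an upper-case extension. The function name `hasImageExtension` is hypothetical. It still cannot recognise images served by a script, since it never looks past the URL itself:

```php
<?php
// Hypothetical helper: guesses from the URL path alone whether it names an
// image, ignoring any query string or fragment.
function hasImageExtension(string $url): bool
{
    $imgExts = ['gif', 'jpg', 'jpeg', 'png', 'tiff', 'tif'];

    // parse_url() separates the path from "?query" and "#fragment" parts.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // No path component (or an unparseable URL).
    }

    // Extensions are compared case-insensitively.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, $imgExts, true);
}
```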
TheMonkeyKing
  • 1
    This is an image https://contulmeu.moldcell.md/ps/selfcare_uni/crypt/cryptographp.inc.php but this url will not pass your validation! You are wrong. – Jekis Jun 12 '12 at 10:34
  • As Jenechka points out this technique is very limited: 1. it assumes all image files have valid extensions; 2. it has no support for image formats the author hasn't thought about/are created after the program is written – danio Jul 06 '12 at 08:18
  • This is not a perfect solution, but I like it because of its simplicity. – Hoàng Vũ Tgtt Jun 11 '21 at 11:12
4

Similar to some of the other answers, but with slightly different logic.

$headers = @get_headers($url, 1); // @ to suppress errors. Remove when debugging.
if (isset($headers['Content-Type'])) {
  // If the URL redirects, 'Content-Type' is an array with one entry per response; use the last one.
  $type = $headers['Content-Type'];
  if (is_array($type)) {
    $type = end($type);
  }
  if (strpos($type, 'image/') === FALSE) {
    // Not a regular image (including a 404).
  }
  else {
    // It's an image!
  }
}
else {
  // No 'Content-Type' returned.
}

@ is an error control operator.

Note we used the "strict" operator === FALSE in the condition because strpos($type, 'image/') returns 0 in our use case when the needle is found at the start of the haystack. With the loose comparison ==, that 0 would erroneously be treated as FALSE.
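
The difference between the two comparisons can be seen in a minimal sketch. The function names `looseCheckSaysMissing` and `strictCheckSaysMissing` are hypothetical and exist only to put the two operators side by side:

```php
<?php
// Hypothetical helper using the loose comparison. It wrongly reports
// "missing" when the needle sits at position 0, because 0 == false is true.
function looseCheckSaysMissing(string $contentType): bool
{
    return strpos($contentType, 'image/') == false;
}

// Hypothetical helper using the strict comparison. It reports "missing"
// only when strpos() really returned false.
function strictCheckSaysMissing(string $contentType): bool
{
    return strpos($contentType, 'image/') === false;
}
```

For the value 'image/png', strpos() returns 0, so the loose version claims the needle is missing while the strict version does not. For 'text/html' both versions agree that it is missing.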

Martin Postma
1

We can use exif_imagetype() to check the image type, so no other content types are allowed. It only accepts images, and we can restrict them to a few image types. The following sample code shows how to allow only the GIF image type.

if (exif_imagetype('image.gif') != IMAGETYPE_GIF) {
    echo 'The picture is not a gif';
}

You can use the following image type constants:

 IMAGETYPE_GIF
 IMAGETYPE_JPEG
 IMAGETYPE_PNG
 IMAGETYPE_SWF
 IMAGETYPE_PSD
 IMAGETYPE_BMP
 IMAGETYPE_TIFF_II (intel byte order)
 IMAGETYPE_TIFF_MM (motorola byte order)
 IMAGETYPE_JPC
 IMAGETYPE_JP2
 IMAGETYPE_JPX
 IMAGETYPE_JB2
 IMAGETYPE_SWC
 IMAGETYPE_IFF
 IMAGETYPE_WBMP
 IMAGETYPE_XBM
 IMAGETYPE_ICO

more details : link

Janith Chinthana
-1

A fast solution for broken or missing image links.
I recommend not using getimagesize(), because it first downloads the image and then checks its size, and if the content is not an image it fails with an error. Use the code below instead:

if(checkRemoteFile($imgurl))
{
// the URL was found, so treat it as an image
echo "this is image";
}

function checkRemoteFile($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,$url);
    // don't download content
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    if(curl_exec($ch)!==FALSE)
    {
        return true;
    }
    else
    {
        return false;
    }
}

Note: this code only helps you identify broken or missing image URLs. It will not help you identify the image type or read the headers.

Hassan Saeed
  • 2
    This will only tell you if the URL exists so it is misleading to even say that it will find "image" because it will find "anything" . – Mike Q Mar 26 '18 at 18:53