62

I am generating dynamic image URLs for book ISBNs, and I need a reliable way in PHP to check whether the images actually exist at the remote URL. I tried various approaches with different PHP libraries, curl, etc., but none of them works well, and some are downright slow. Since I need to generate (and check!) about 60 URLs for each book in my database, this adds up to a huge waiting time. Any clues?

Cristian Cotovan
  • 2
    Dupe? http://stackoverflow.com/questions/981954/how-can-one-check-to-see-if-a-remote-file-exists-using-php – Mark Biek Sep 01 '09 at 18:36

10 Answers

118
function checkRemoteFile($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,$url);
    // don't download content
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_FAILONERROR, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    $result = curl_exec($ch);
    curl_close($ch);
    if($result !== FALSE)
    {
        return true;
    }
    else
    {
        return false;
    }
}

This is the fastest way, provided your host supports curl.

SirDerpington
dangkhoaweb
  • 2
    I think this is the best answer to the question. Thanks to the CURLOPT_NOBODY option being set to true, this will be very fast. – markus Apr 27 '12 at 14:47
  • 10
    For me this function always returns true, even if I provide wrong URL of image. – Devnegikec Aug 06 '13 at 06:03
  • 1
    In my case (expedia, 10+ images in a loop), "true==file_get_contents($imgurl,0,null,0,1)" was faster than curl – Jeffz Mar 22 '14 at 21:58
  • 5
    I tested all mentioned methods several times on 44 YouTube thumb URLs. Result: curl: 1.8s, file_get_contents: 2.6s, getimagesize: 4.5s, imagecreatefromjpeg: 4.7s – Martin Nov 19 '15 at 19:07
  • 12
    Don't forget to `curl_close($ch);` at the end – toesslab Jan 13 '16 at 09:50
  • @Devnegikec and anyone else it's not working for - you may need to add one/both of these lines: `curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'); curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);` – Pamela Jan 31 '16 at 11:43
  • @Daenu Since they are returning from the function the `curl_close` would have to be before the two `return`s to actually do anything in the above code. – Tim Ramsey Apr 05 '16 at 13:59
  • 1
    I know this answer is from 2011, but I used it in a project I'm working on and if you don't include the `curl_close($ch)` either at the end or after each return then it will cause your application to hang. It did in my case. – mickburkejnr Jan 03 '18 at 13:18
  • I suggest checking the HTTP response code, because many sites handle a missing URL with their own page, so this function returns true even when the page doesn't exist but the site serves a custom 404 page. After `curl_exec($ch);` call `$info = curl_getinfo($ch);` and then check whether it returned OK (200 status code): `if ($result === TRUE || $info['http_code'] === 200)` – PHPisMyPassion May 28 '21 at 08:36
  • 1
    @NabilKadimi using `return curl_exec( $ch ) !== FALSE;` won't give you a chance to close the connection. If the connection is closed before that line, it'll throw an error – MichaelTheDev Sep 22 '22 at 09:10
66

Use the getimagesize() function like this:

$external_link = 'http://www.example.com/example.jpg';
if (@getimagesize($external_link)) {
    echo "image exists ";
} else {
    echo "image does not exist ";
}
mohsin139
  • 16
    Straight from http://php.net/manual/en/function.getimagesize.php - "Do not use getimagesize() to check that a given file is a valid image. Use a purpose-built solution such as the Fileinfo extension instead." This method downloads the file and is inferior to any method that can just check via the HEAD – Tim Ramsey Mar 30 '16 at 17:43
  • 2
    To @TimRamsey's comment, the problem comes from non-image files, so if you validate that the file type is an image you should be OK. Elegant solution, btw. – Praesagus Oct 06 '16 at 22:09
  • 4
    simple solution but very slow – Elyor Apr 20 '17 at 05:42
  • @Praesagus - While you may be "OK", this solution is still inferior(SLOW) compared to just fetching the HEAD in a request. The answer by dangkhoaweb shows how this can be done. – Tim Ramsey Aug 25 '17 at 17:15
6

There is no "easy" way here - at a very minimum, you need to generate a HEAD request and check the resulting content type to make sure it's an image. That's not taking into account possible referrer issues. curl is the way to go here.
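A minimal sketch of that idea, assuming the curl extension is available: send a HEAD request, require a 2xx status, and check that the reported content type starts with "image/". The function names remoteImageExists() and isImageContentType() and the 5-second timeout are illustrative choices, not part of the answer above, and referrer restrictions are not handled here.

```php
<?php
// Sketch only: remoteImageExists() and isImageContentType() are illustrative names.

/** True when a Content-Type header value names an image type, e.g. "image/jpeg". */
function isImageContentType($contentType)
{
    return is_string($contentType)
        && stripos(ltrim($contentType), 'image/') === 0;
}

/** Sends a HEAD request and reports whether the URL answers 2xx with an image type. */
function remoteImageExists($url, $timeoutSeconds = 5)
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, no body download
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // keep any output off the page
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);

    $ok          = curl_exec($ch) !== false;
    $status      = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    curl_close($ch);

    return $ok && $status >= 200 && $status < 300
        && isImageContentType($contentType);
}
```

Some servers answer HEAD requests differently from GET requests, so it is worth testing this against the actual image host before relying on it.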

ChssPly76
4

I have been doing this for my real estate picture tracking...

$im = @imagecreatefromjpeg($pathtoimg);
if($im)
  imagedestroy($im); // dont save, just ack...
elseif(!$missing[$inum])
  $img404arr[] = $inum;

It 'seems' faster than downloading the actual image, taking about 0.3 seconds each for images that average 100 KB.

I wish I could just do a header check and read whether I get a 200 vs a 404 without downloading anything. Anyone have that handy?
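One possible way to do such a header-only check without curl is get_headers() with a stream context that switches the request method to HEAD. This is a sketch under assumptions: it needs allow_url_fopen enabled and PHP 7.1+ for the context argument, and the names headStatus() and statusFromHeaders() are made up for illustration.

```php
<?php
// Sketch only: headStatus() and statusFromHeaders() are illustrative names.

/** Returns the last HTTP status code found in a get_headers() style array, or 0. */
function statusFromHeaders(array $headers)
{
    $status = 0;
    foreach ($headers as $line) {
        // Redirects produce several status lines; the last one is the final response.
        if (is_string($line) && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

/** Issues a HEAD request through the HTTP stream wrapper and returns the status code. */
function headStatus($url, $timeoutSeconds = 5)
{
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'timeout' => $timeoutSeconds],
    ]);
    // Returns false when the request fails entirely (DNS error, timeout, ...).
    $headers = @get_headers($url, false, $context);
    return $headers === false ? 0 : statusFromHeaders($headers);
}
```

With this, something like `headStatus($pathtoimg) === 200` would distinguish a 200 from a 404 without fetching the image body.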

sth
Andrew Deal
4

You could use curl. Just set the curl option CURLOPT_NOBODY to true. This will skip body information and only get the head (thus http code as well). Then, you could use the CURLOPT_FAILONERROR to turn this whole process into a true/false type check
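Since the question mentions about 60 URLs per book, the same CURLOPT_NOBODY check could also be run in parallel with the curl_multi_* functions instead of one request at a time. The following is a rough sketch, not a tuned implementation: checkRemoteFiles() is a hypothetical name, the 10-second timeout is arbitrary, and it judges success by the HTTP status code rather than CURLOPT_FAILONERROR.

```php
<?php
// Sketch only: checkRemoteFiles() is an illustrative name.

/** Checks many URLs in parallel with HEAD requests; returns [url => bool]. */
function checkRemoteFiles(array $urls, $timeoutSeconds = 10)
{
    $urls = array_values(array_unique($urls));
    if ($urls === []) {
        return [];
    }

    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);         // headers only
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $url => $ch) {
        // A failed connection leaves the status code at 0, which counts as "missing".
        $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $results[$url] = $code >= 200 && $code < 300;
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results;
}
```

The total wait is then roughly that of the slowest single request rather than the sum of all of them, although the remote host may throttle many simultaneous connections.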

Kevin Peno
3

You can use getimagesize()

Credit: http://junal.wordpress.com/2008/07/22/checking-if-an-image-url-exist/

timborden
1

It's probably moot at this point, but this works for me:

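function is_webfile($webfile)
{
    // fopen() emits an E_WARNING on failure; @ suppresses it
    $fp = @fopen($webfile, "r");
    if ($fp === false) {
        return false;
    }

    fclose($fp);
    return true; // return a boolean rather than the already-closed handle
}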
isherwood
ChronoFish
  • Try to avoid the at-operator. – viam0Zah Jul 06 '10 at 12:43
  • 2
    @TörökGábor definitely a good suggestion, however not on point - manual: `If the open fails, an error of level E_WARNING is generated. You may use @ to suppress this warning.` – jave.web Apr 13 '17 at 00:21
0

Solution from https://www.experts-exchange.com

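<?php
function url_exists($url) {
    $ch = curl_init($url); // curl_init() alone sends no request
    if ($ch === false) return false;
    curl_setopt($ch, CURLOPT_NOBODY, true);         // headers only
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // HTTP >= 400 counts as failure
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $exists = curl_exec($ch) !== false;
    curl_close($ch);
    return $exists;
}
?>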
Gr Brainstorm
0

I adapted this from the other answers. Besides checking whether the image exists, it also handles errors (redirects, timeouts, HTTP status codes), which the other answers leave out.

Here is what I am using:

function checkRemoteFile($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects

    // Set a maximum timeout of 10 seconds to prevent the script from hanging
    curl_setopt($ch, CURLOPT_TIMEOUT, 10); 

    // Execute the request and get the HTTP status code
    curl_exec($ch);
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    curl_close($ch);

    // Check the HTTP status code
    if($httpCode >= 200 && $httpCode < 300) {
        // The file exists and the server returned a successful HTTP status code
        return true;
    } else {
        // The file does not exist or the server returned an error HTTP status code
        return false;
    }
}
Eric Aya
-1

If the images all exist on the same remote server (or in the same network), you could run a web service on that server that checks the file system for the image file and returns a bool value indicating whether the image exists or not.
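As a rough illustration of such a service, assuming you control the image server and it runs PHP: a small endpoint could look up the requested file name in the image directory and answer with JSON. The function names, the "file" query parameter and the /var/www/images path below are all hypothetical.

```php
<?php
// Sketch only: both function names, the "file" parameter and the
// image directory path are hypothetical.

/** True when $fileName exists as a regular file directly inside $baseDir. */
function imageExistsOnDisk($baseDir, $fileName)
{
    // basename() drops any directory part, so "../" cannot escape $baseDir.
    $safeName = basename($fileName);
    return $safeName !== '' && is_file(rtrim($baseDir, '/') . '/' . $safeName);
}

/** Builds the JSON response body for a query such as ?file=cover.jpg */
function handleRequest(array $query, $baseDir)
{
    $fileName = (isset($query['file']) && is_string($query['file']))
        ? $query['file']
        : '';
    return json_encode(['exists' => imageExistsOnDisk($baseDir, $fileName)]);
}

// In the endpoint script (the path is a placeholder):
// header('Content-Type: application/json');
// echo handleRequest($_GET, '/var/www/images');
```

A local file system lookup like this avoids one HTTP request per image, and the endpoint could be extended to accept a whole batch of names in a single call.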

Jaimal Chohan