
I am trying to get the image size (image dimensions, width and height) of hundreds of remote images, and getimagesize is way too slow.

I have done some reading and found that the quickest way would be to use file_get_contents to read a certain number of bytes from each image and examine the size within the binary data.

Has anyone attempted this before? How would I examine the different formats? Has anyone seen a library for this?

Alfred Bez
Sir Lojik
  • It's probably slow because the images are _remote_. Download them first, and `getimagesize()` will be blazing fast. After all, it only reads certain binary bytes from the images. – kijin Jan 08 '11 at 20:24
  • That's why I want to use file_get_contents: to 1) not download the whole file and 2) read only certain bytes to get the image size. – Sir Lojik Jan 08 '11 at 20:27
  • Actually, I could use fopen and fgets to read just those binary bytes. – Sir Lojik Jan 08 '11 at 20:32
  • Do the remote sites not give a `Content-Length` header? – salathe Jan 08 '11 at 20:39
  • @salathe, I'm more interested in getting image dimensions from the binary data. – Sir Lojik Jan 08 '11 at 20:52
  • I would be surprised if getimagesize() downloaded significantly more of the file than required. – goat Jan 08 '11 at 20:54

3 Answers

```php
function ranger($url) {
    // Ask the server for only the first 32 KB of the file
    $headers = array("Range: bytes=0-32768");

    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($curl);
    curl_close($curl);
    return $data;
}

$start = microtime(true);

$url = "http://news.softpedia.com/images/news2/Debian-Turns-15-2.jpeg";

$raw = ranger($url);
$im = imagecreatefromstring($raw);

$width = imagesx($im);
$height = imagesy($im);

$stop = round(microtime(true) - $start, 5);

echo $width . " x " . $height . " ({$stop}s)";
```

test...

640 x 480 (0.20859s)

Loading 32 KB of data worked for me.
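If pulling in GD just to read dimensions feels heavy, the partial bytes can also be parsed directly. As a rough sketch (not part of the answer above; `sizeFromPngBytes` is a made-up helper): PNG stores width and height as big-endian 32-bit integers at byte offsets 16 and 20, right after the 8-byte signature and the IHDR chunk header.

```php
<?php
// Minimal sketch: extract dimensions from the first 24 bytes of a PNG.
// sizeFromPngBytes() is a hypothetical helper, not part of any library.
function sizeFromPngBytes($data) {
    // Verify the fixed 8-byte PNG signature first
    if (substr($data, 0, 8) !== "\x89PNG\r\n\x1a\n") {
        return null; // not a PNG (a JPEG would need a SOF-marker scan instead)
    }
    // Bytes 16-23 hold width and height as big-endian unsigned 32-bit ints
    $parts = unpack('Nwidth/Nheight', substr($data, 16, 8));
    return array($parts['width'], $parts['height']);
}
```

For JPEGs the equivalent trick is scanning for a SOFn marker (0xFFC0 to 0xFFCF), which is what the `getjpegsize` function mentioned in another answer does.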

Dejan Marjanović
  • Just read this article; it explains the different and best methods of retrieving the file: http://www.php-mysql-tutorial.com/wikis/php-tutorial/reading-a-remote-file-using-php.aspx. Step 2: how do I differentiate the different binary code pulled in to get the image size? – Sir Lojik Jan 08 '11 at 20:38
  • I wouldn't trust a PHP tutorial written in .aspx :P I updated my answer; you should have everything you need. If that's not it then sorry, I don't entirely understand your question. If you want to compare it as binary, you would have to load it entirely into a string. – Dejan Marjanović Jan 08 '11 at 20:45
  • @webarto I think he wants the image dimensions, which will be in the image metadata, not the file metadata. He'll have to download the initial bytes of the file, but just how many depends on the image format. – moinudin Jan 08 '11 at 20:45
  • Btw, I should have specified: I'm looking for image dimensions. Let me edit. – Sir Lojik Jan 08 '11 at 20:46
  • OK, that explains everything. Check this link: http://regex.info/exif.cgi?url=http://g.imagehost.org/0861/krs.jpg, scroll down, and you will see XMP loaded in ~3.5 KB. Load a couple of your images and see how many bytes you would have to load, but I don't know whether this will work if the file is "broken". – Dejan Marjanović Jan 08 '11 at 20:55
  • With the range parameter I had problems with larger PNGs. The best way in my case was to enhance the API to return not only the image URL but also the size or orientation. – ownking Jan 16 '12 at 10:02
  • Yey, it's an awesome answer. – NullPoiиteя Mar 28 '13 at 13:54
  • This also helps when you want to keep 'allow_url_fopen' off. – circlecube May 22 '13 at 19:39
  • It may raise a **Notice**: Premature end of JPEG file; just be aware of that. It can be solved pretty easily with `@` error suppression, as in `@imagecreatefromstring($raw)` :) – jave.web Jan 22 '14 at 13:31
  • Getting this error: "imagecreatefromstring(): gd-jpeg, libjpeg: recoverable error: Premature end of JPEG file in /path_of_ile on line 30" – s4suryapal Sep 26 '14 at 07:24
  • How simple would it be to make this work with locally hosted images as well? Mine fails when it isn't a remote image. But @s4suryapal, just put an `@` at the beginning of the `imagecreatefromstring()` call to suppress that error. No way around it if you're chopping the end of the file off. – SISYN Aug 09 '15 at 00:20
  • Here is the image that causes it to break on my server (the script is being run from within merkd.com): http://merkd.com/usr/photos/1438385719.2012-02-09-114643.1.jpg – SISYN Aug 09 '15 at 00:29
  • This trick is currently buggy in the most recent versions of PHP 5.5 and 5.6: https://bugs.php.net/bug.php?id=70315. It will break your site and you won't know about it until Google reports all your pages are returning a 500 server error (even if they look just fine to you). – Manuel Razzari Aug 28 '15 at 16:18
  • It does not work with images over HTTPS, for example http://martelmaides.agencypilot.com/store/property/218+11.jpg – Display name Jun 09 '16 at 11:42
  • To override HTTPS verification with cURL: `curl_setopt($request, CURLOPT_SSL_VERIFYHOST, false); curl_setopt($request, CURLOPT_SSL_VERIFYPEER, false);` – pyrsmk Jan 25 '17 at 11:17
  • @DanL: Even if you use a full URL to a local source, Apache will treat it as local because it will look up the DNS and see that it is hosted locally. – Exit Mar 10 '17 at 14:50
  • In my testing, `ranger` was slower than the native `getimagesize` when accessing local files. See my answer below for results. – Exit Mar 10 '17 at 14:58
  • Maybe it's too late for me to ask, but how do we get the image type (bmp, jpeg, gif, png, etc.) by using this method? – Rizki Pratama Mar 15 '17 at 06:43
  • From my testing, this method only reliably works with JPGs. All other image formats failed. – Paul Sheldrake Jul 10 '20 at 13:14

I have created a PHP library for exactly this scenario. It works by downloading the absolute minimum of the remote file needed to determine the image size. This differs for every image; for JPEGs in particular it depends on how many embedded thumbnails the file contains.

It is available on GitHub here: https://github.com/tommoor/fastimage

Example usage:

```php
$image = new FastImage($uri);
list($width, $height) = $image->getSize();
echo "dimensions: " . $width . "x" . $height;
```
Tom
  • This works slower for me than getimagesize from PHP: FastImage: 0.079681873321533s, native getimagesize: 0.023485898971558s, ranger (webarto example): 0.16773s – catalinux Oct 18 '12 at 08:01
  • I'd be very interested if it is reproducibly slower; how many times did you run the test? You can check out the source code: normally under 1 KB of the image needs to be downloaded. – Tom Feb 04 '13 at 07:32
  • I really like this class. However, you're using fopen, so a slow-responding remote request will take a very long time and overload the server. Why don't you use cURL instead? Also, some websites need a faked header (HTTP referer) to return image dimensions, so cURL is better. – TomSawyer Oct 21 '13 at 09:36
  • What if I want the filesize in bytes? – codecowboy Jan 23 '14 at 10:59
  • @codecowboy Use filesize() for local files, or an HTTP HEAD request and the Content-Length header for remote ones. – TheJosh Apr 29 '14 at 02:11

I was looking for a better way to handle this situation, so I used a few different functions found around the internet.

Overall, when it worked, the fastest tended to be the getjpegsize function that James Relyea posted on the PHP page for getimagesize, beating the ranger function provided by Dejan above. http://php.net/manual/en/function.getimagesize.php#88793

```
Image #1 (787KB JPG on external older server)
getimagesize: 0.47042 to 0.47627 - 1700x2340 [SLOWEST]
getjpegsize:  0.11988 to 0.14854 - 1700x2340 [FASTEST]
ranger:       0.1917  to 0.22869 - 1700x2340

Image #2 (3MB PNG)
getimagesize: 0.01436 to 0.01451 - 1508x1780 [FASTEST]
getjpegsize:  failed
ranger:       failed

Image #3 (2.7MB JPG)
getimagesize: 0.00855 to 0.04806 - 3264x2448 [FASTEST]
getjpegsize:  failed
ranger:       0.06222 to 0.06297 - 3264x2448 * [SLOWEST]

Image #4 (1MB JPG)
getimagesize: 0.00245 to 0.00261 - 2031x1434
getjpegsize:  0.00135 to 0.00142 - 2031x1434 [FASTEST]
ranger:       0.0168  to 0.01702 - 2031x1434 [SLOWEST]

Image #5 (316KB JPG)
getimagesize: 0.00152 to 0.00162 - 1280x720
getjpegsize:  0.00092 to 0.00106 - 1280x720 [FASTEST]
ranger:       0.00651 to 0.00674 - 1280x720 [SLOWEST]
```
  • ranger failed when grabbing 32768 bytes on Image #3, so I increased it to 65536, and it then grabbed the size successfully.

There are problems, though, as both ranger and getjpegsize are limited in ways that make them not stable enough to rely on. Both failed when dealing with a large JPG image of around 3 MB, though ranger worked after increasing the number of bytes it grabs. Also, these alternatives only deal with JPG images, which means a conditional would be needed to use them on JPGs and getimagesize on the other image formats.
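That conditional could look something like the following sketch. Everything here is hypothetical: `looksLikeJpeg` is a naive extension check standing in for real type detection, and the fast path stands in for getjpegsize/ranger from this thread.

```php
<?php
// Hedged sketch of the dispatch idea: JPEG-only fast path when the URL
// looks like a JPEG, generic getimagesize() otherwise. All names are made up.
function looksLikeJpeg($url) {
    $path = parse_url($url, PHP_URL_PATH);
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, array('jpg', 'jpeg'), true);
}

function smartImageSize($url, callable $jpegOnlyFastPath) {
    if (looksLikeJpeg($url)) {
        $size = $jpegOnlyFastPath($url); // e.g. getjpegsize() over ranged bytes
        if ($size !== null) {
            return $size;
        }
    }
    // Fall back to the slow but format-complete native function
    $info = getimagesize($url);
    return $info ? array($info[0], $info[1]) : null;
}
```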

Also, note that the first image was on an older server running an old version of PHP (5.3.2), whereas the other 4 images came from a modern server (cloud-based cPanel with MultiPHP dialed back to 5.4.45 for compatibility).

It's worth noting that the cloud-based server did far better with getimagesize, which beat out ranger; in fact, in all 4 tests on the cloud server, ranger was the slowest. Those 4 tests also pulled the images from the same server the code was running on, though from different accounts.

This makes me wonder whether the PHP core improved in 5.4 or whether the Apache version is a factor. It might also come down to server availability and load. And let's not forget that networks get faster every year, so maybe the speed issue is becoming less of a concern.

So, the end result, and my answer, is this: for complete support of all web image formats while still getting the image size very fast, it may be best to suck it up, use getimagesize, and then cache the image sizes in a database table (if the images will be checked more than once). In that scenario, only the first check incurs the larger cost; subsequent requests are minimal and faster than any function that reads the image headers.

As with any caching, it only works well if the content doesn't change and there is a way to check whether it has changed. So a possible solution is to check only the headers of an image URL when consulting the cache and, if they differ, dump the cached version and grab the size again with getimagesize.
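As a rough illustration of that cache-plus-validation idea (all names here are hypothetical; the validator could be the `ETag` or `Last-Modified` header from a HEAD request):

```php
<?php
// Sketch: cache dimensions keyed by URL, invalidated when a header-derived
// validator (e.g. ETag or Last-Modified) changes. All names are made up.
class ImageSizeCache {
    private $cache = array(); // url => array('validator' => ..., 'size' => ...)

    // $fetchSize is only called on a cache miss or when the validator changed.
    public function getSize($url, $validator, callable $fetchSize) {
        if (isset($this->cache[$url]) && $this->cache[$url]['validator'] === $validator) {
            return $this->cache[$url]['size']; // hit: no image bytes read at all
        }
        $size = $fetchSize($url); // e.g. getimagesize() or a ranged download
        $this->cache[$url] = array('validator' => $validator, 'size' => $size);
        return $size;
    }
}
```

In production the array would be a database table, and the validator would come from something like `get_headers($url, 1)`.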

Exit