
Possible Duplicate:
Super fast getimagesize in php

I want to get the image sizes of all the images within webpages (hence all img tags)

The problem is that PHP's getimagesize() function is very inefficient for this.

I tried running this...

for ($i = 0; $i < 55; $i++) {
    getimagesize('http://www.blackcommentator.com/265/265_images/265_cartoon_do_something_now_bill_large.jpg');
}

...and it took a very long time to complete.

Does anybody know of a more efficient alternative to getimagesize() for obtaining the sizes of the images in a webpage with PHP, when a typical page has 20-30 images?

kamikaze_pilot
  • I'd imagine this is a network limitation (i.e. the round trip for actually grabbing the image takes longer than actually calculating its size). You might want to benchmark this on an image that is on the server that you're running the code from. – Bailey Parker Nov 10 '11 at 03:52
  • Not really; the aim is to get all the images within any single webpage, and clearly it doesn't take long to download those images, otherwise all webpages would be slow. Even when I simply took all the img tags of, say, cnn.com and computed their image sizes, it still took a long time, even though cnn.com and all of its images load in a matter of seconds if you simply visit cnn.com – kamikaze_pilot Nov 10 '11 at 03:54
  • And cnn.com probably has a much better CDN than blackcommentator.com. Right now there are 84 images used on CNN.com's homepage, but not all of them are 400 x 400 unoptimized jpegs. Try benchmarking how long it takes to do a `file_get_contents()` on the image (a rough sketch follows these comments). I'll be surprised if most of the time from your original benchmark isn't consumed by downloading. – Bailey Parker Nov 10 '11 at 04:01
  • Your browser downloads multiple images in parallel. Your PHP script is downloading the same image 55 times in a row. The overhead of setting up the TCP connection probably outweighs the time spent finding the image's dimensions. The problem is the network. You cannot algorithmically make your network connection faster, and PHP doesn't have a function to do this. – user229044 Nov 10 '11 at 04:12
  • In that case, how does facebook do something along these lines? If you try to share cnn.com to facebook, it will immediately display a list of potential thumbnails taken from the site's images. I notice that these thumbnails have facebook urls, which means they actually downloaded all these images and stored them on their server, all happening in an instant. Any idea how they manage to do this? – kamikaze_pilot Nov 10 '11 at 04:26
  • Facebook doesn't download the images to their server. They simply link to them and have the user's browser download and display them. At least, this is what it sounds like. – Bailey Parker Nov 19 '11 at 22:58
  • I think your sense of scale might be off if you think you should be getting the same performance as [facebook](http://www.tomshardware.com/news/facebook-servers-power-wattage-network,16961.html) – ChatGPT Sep 18 '12 at 13:18
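As a rough sketch of the benchmark suggested in the comments above (the URL is just the one from the question, and /tmp/test.jpg is an arbitrary scratch path), something along these lines separates the download time from the time spent on the size calculation:

$url = 'http://www.blackcommentator.com/265/265_images/265_cartoon_do_something_now_bill_large.jpg';

$t = microtime(true);
$data = file_get_contents($url);            // network transfer
$downloadTime = microtime(true) - $t;

file_put_contents('/tmp/test.jpg', $data);  // arbitrary local scratch file

$t = microtime(true);
getimagesize('/tmp/test.jpg');              // local read of the image header
$decodeTime = microtime(true) - $t;

printf("download: %.3fs, getimagesize: %.5fs\n", $downloadTime, $decodeTime);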

2 Answers


Use curl to save the images, but run the curl requests in parallel - that way the downloads complete much faster (the bottleneck isn't bandwidth, it's the time spent establishing each request, so parallelizing helps). Once you've saved the images to a local directory, run getimagesize() on all of them.
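A minimal sketch of that approach using curl_multi (the $urls array and the temp-file naming are placeholders, and error handling is omitted):

$urls = array(
    'http://example.com/a.jpg',   // the img src values scraped from the page
    'http://example.com/b.jpg',
);

$mh = curl_multi_init();
$handles = array();

foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// run all transfers at the same time and wait for them to finish
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);

$sizes = array();
foreach ($handles as $i => $ch) {
    $file = sys_get_temp_dir() . '/img_' . $i . '.tmp';
    file_put_contents($file, curl_multi_getcontent($ch));
    $sizes[$urls[$i]] = getimagesize($file);   // fast: the file is now local
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

Because the per-image cost is dominated by connection setup, the total time ends up close to the slowest single download rather than the sum of all of them.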

Tim

For starters, cache the image locally. You're hauling it across the network 55 times. The overhead of downloading the image drastically outweighs the actual time spent finding its width and height.
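For the benchmark in the question, that looks roughly like this (the /tmp path is just an example):

$url  = 'http://www.blackcommentator.com/265/265_images/265_cartoon_do_something_now_bill_large.jpg';
$file = '/tmp/cartoon.jpg';
file_put_contents($file, file_get_contents($url));   // one network round trip

for ($i = 0; $i < 55; $i++) {
    getimagesize($file);   // purely local reads from here on
}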

If you're talking about finding the size of 55 different images, you might consider parallelizing your code somewhat. You're probably spending as much time setting up connections as you are transferring actual data, and since downloading one image at a time is probably nowhere near saturating your Internet connection, you stand to literally double your performance by running two concurrent processes. Continue to increase concurrency until you stop seeing performance gains.

user229044
  • Yeah, but even with 55 different images (hence downloading is a must) it's still slow... that code above is just an example and means nothing really – kamikaze_pilot Nov 10 '11 at 03:51
  • There is no way. You're fetching the images across the network; it's the *network* that's slow. If you can't obtain the images locally, it can't be helped beyond installing a faster Internet connection. – user229044 Nov 10 '11 at 04:11