
I'm using the imagesize gem to check the dimensions of remote images and push only the images that are big enough into an array.

require 'open-uri'
require 'nokogiri'
require 'image_size'
data = Nokogiri::HTML(open(url))
images = []
forcenocache = Time.now.to_i # No cache because jquery load event doesn't fire for cached images
data.css("img").each do |image|
  image_path = URI.join(site, URI.encode(image[:src]))
  open(image_path, "rb") do |fh|
    image_size = ImageSize.new(fh.read).get_size()
    unless image_size[0] < 200 || image_size[1] < 100
      image_element = "<img src=\"#{image_path}?#{forcenocache}\">"
      images.push(image_element)
    end
  end
end

I tried using JS on the front-end to check image dimensions but there seems to be a browser limit to how many images can be loaded at once.

Doing it with imagesize is much slower than using JS. Is there a better, faster way to do this?

Aen Tan
  • My recommendations: first, find all the image links on the page and filter out duplicates. Loading only part of each image (the first few kilobytes) will probably work, so try that. You can also use threads to check multiple images in parallel. There are probably also some img tags on the page with dimensions already set. – taro May 08 '11 at 17:46
  • How would one go about reading the first n kilobytes of the image to get the size using open-uri? – Aen Tan May 10 '11 at 05:00
  • Here is a link to my related question: http://stackoverflow.com/questions/1120350/how-to-download-via-http-only-piece-of-big-file-with-ruby – taro May 10 '11 at 12:32
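
The suggestions in these comments (fetch only the first few kilobytes, check images in parallel) could be sketched roughly as follows. This is an illustration under assumptions, not a tested solution: it uses Net::HTTP with a Range header rather than open-uri, it only understands PNG headers, and the names `fetch_head`, `png_size` and `sizes_in_parallel` are made up for this example. It also starts one thread per URL, which may be too many for a large page.

```ruby
require 'net/http'
require 'uri'

PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b

# Ask the server for only the first `bytes` bytes via an HTTP Range header.
# A server may ignore Range and send everything, so the body is truncated
# to keep the return value at most `bytes` bytes long.
def fetch_head(url, bytes = 4096)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    request = Net::HTTP::Get.new(uri.request_uri, 'Range' => "bytes=0-#{bytes - 1}")
    http.request(request).body.to_s.byteslice(0, bytes)
  end
end

# Read [width, height] from a PNG header: the IHDR chunk stores both as
# 4-byte big-endian integers at byte offsets 16 and 20.
# Returns nil if the data is not (enough of) a PNG.
def png_size(data)
  data = data.b
  return nil unless data.bytesize >= 24 && data.start_with?(PNG_SIGNATURE)

  data.byteslice(16, 8).unpack('NN')
end

# Run the given block once per unique URL, each in its own thread, and
# return a { url => result } hash. A URL whose block raises maps to nil.
def sizes_in_parallel(urls, &sizer)
  threads = urls.uniq.map do |url|
    Thread.new do
      begin
        [url, sizer.call(url)]
      rescue StandardError
        [url, nil]
      end
    end
  end
  threads.map(&:value).to_h
end

# Example (needs network access; image_urls is a hypothetical array of URLs):
# sizes = sizes_in_parallel(image_urls) { |url| png_size(fetch_head(url)) }
```

Other formats such as JPEG and GIF store their dimensions elsewhere in the file, so a real version would need a parser for each format.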

1 Answer


I think this gem does what you want: https://github.com/sdsykes/fastimage

FastImage finds the size or type of an image given its uri by fetching as little as needed
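
As a rough sketch of how the gem might replace the loop in the question: `FastImage.size` takes a URI and returns `[width, height]`, or nil when it cannot determine the size. The helper names `FASTIMAGE_SIZER` and `large_images` are invented for this example, and the 200x100 minimum comes from the question. The gem is required lazily here only so that the filter can also be run with a stub in place of real network calls.

```ruby
# Default sizer: asks FastImage for [width, height]; nil if it can't tell.
FASTIMAGE_SIZER = lambda do |url|
  require 'fastimage' # gem install fastimage
  FastImage.size(url)
end

# Keep only the URLs whose image is at least min_width x min_height.
# Duplicate URLs are checked once; images of unknown size are dropped.
def large_images(urls, min_width: 200, min_height: 100, sizer: FASTIMAGE_SIZER)
  urls.uniq.select do |url|
    width, height = sizer.call(url)
    width && height && width >= min_width && height >= min_height
  end
end

# Example (needs network access and the fastimage gem; image_urls is a
# hypothetical array of absolute image URLs taken from the page):
# images = large_images(image_urls).map { |u| "<img src=\"#{u}\">" }
```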

Luqman
  • I just want to say... I stumbled upon this answer, and therefore this gem. It's incredible. +1 +1 +1 +1 +1!!! – Dudo Aug 29 '13 at 03:01