
I'm trying to compare two images and see whether they are identical. They will have the same dimensions and may have the same file size, but the content will sometimes change, and I want to be able to detect that.

I have two ways of doing it in my case. One is to get the number of colors in each image. (In my case the number of colors changes if the images are different.)

The other is to actually compare the files using an image processor. I've opted to use ruby-vips8 because it's known to be a lot faster than RMagick, and in my case performance matters.

I've experimented with ruby-vips8, but I can't find a way to compare two images or to get the number of colors (so that I can compare using that method).

Any help?

ruby-vips8 is a wrapper around libvips.

http://www.rubydoc.info/gems/ruby-vips8/0.1.0/Vips/
http://www.vips.ecs.soton.ac.uk/index.php?title=VIPS

UPDATE:

Thanks to the answer from the user Aetherus, I realized I don't even need ruby-vips8 for this task. I'm comparing the files as Strings (as he suggested). It's working great for me and it's also really fast.

I didn't mark his answer as the best because my question asked whether this is possible using ruby-vips8. It was a library-specific scenario, so under those conditions the answer from user894763 is more appropriate.

Nakilon
fschuindt
  • It looks like vips supports histograms, and comparing histograms is one way to compare images https://stackoverflow.com/questions/6499491/comparing-two-histograms. OpenCV can provide more sophisticated ways to compare images https://stackoverflow.com/questions/11541154/checking-images-for-similarity-with-opencv – Scott Jacobsen Apr 10 '16 at 02:32

2 Answers

8

There must be hundreds of ways of measuring image similarity; it's a huge field. They vary (mostly) in which features of an image they try to consider.

One family of similarity measures is based on histograms, as Scott said. These techniques don't consider how your pixels are arranged spatially, so your two images could be considered the same even if one has been rotated 45 degrees. They are also fast, since finding a histogram is quick.

A simple histogram matcher might be: find the histograms of the two input images, normalise (so the two hists have the same area ... this removes differences in image size), subtract, square and sum. Now a small number means a good match, larger numbers mean increasingly poor matches.

In ruby-vips this would be:

require 'vips'

a = Vips::Image.new_from_file ARGV[0], access: :sequential
b = Vips::Image.new_from_file ARGV[1], access: :sequential

# find hists, normalise, difference, square
diff_hist = (a.hist_find.hist_norm - b.hist_find.hist_norm) ** 2

# find sum of squares ... find the average, then multiply by the size of the
# histogram
similarity = diff_hist.avg * diff_hist.width * diff_hist.height

puts "similarity = #{similarity}"

On my desktop, this runs in about 0.5s for a pair of 2k x 3k JPEG images.
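To make the arithmetic of that matcher concrete, here is a minimal plain-Ruby sketch that works on histograms already extracted into arrays of bin counts. The names `histogram_distance` and `normalise` are hypothetical, and the normalisation here scales each histogram so its bins sum to 1 (equal area, as described above), which is not necessarily the same scaling that `hist_norm` applies.

```ruby
# Scale a histogram (array of bin counts) so that its bins sum to 1.0.
# This removes differences caused by image size.
def normalise(hist)
  total = hist.sum.to_f
  raise ArgumentError, 'histogram is empty' if total.zero?

  hist.map { |count| count / total }
end

# Sum of squared differences between two normalised histograms.
# 0.0 means identical distributions; larger values mean poorer matches.
def histogram_distance(hist_a, hist_b)
  unless hist_a.size == hist_b.size
    raise ArgumentError, 'histograms must have the same number of bins'
  end

  normalise(hist_a).zip(normalise(hist_b)).sum { |x, y| (x - y)**2 }
end
```

Two histograms with the same shape but different totals, such as `[1, 2, 3]` and `[2, 4, 6]`, come out with a distance of zero, which is the point of normalising first.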

Many matchers are based on spatial distribution. A simple one is to divide the image into an 8x8 grid (like a chess-board), take the average pixel value in each square, then set that square to 1 or 0 depending on whether the average of the square is above or below the average of the whole image. This gives something like a fingerprint for the image which you can store neatly in a 64-bit int. It's insensitive to things like noise, scale changes or small rotations.

To test two images for similarity, XOR their fingerprints and count the number of set bits in the result. Again, 0 would be a perfect match, larger numbers would be less good.

In ruby-vips, you could code this as:

require 'vips'

a = Vips::Image.new_from_file ARGV[0], access: :sequential

# we need a mono image
a = a.colourspace "b-w"

# reduce to 8x8 with a box filter
a = a.shrink(a.width / 8, a.height / 8)

# set pixels to 0 for less than average, 255 for greater than average
a = a > a.avg

a.write_to_file ARGV[1]

Again, this runs in about 0.5s for a 2k x 3k JPEG.
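The comparison step (XOR the fingerprints, count the set bits) is not shown above, so here is a rough plain-Ruby sketch of it. It assumes you have already pulled the 64 cell averages out of the image into a flat array in row-major order; `fingerprint` and `hamming_distance` are hypothetical helper names, not part of ruby-vips.

```ruby
# Pack 64 grey-level cell averages (an 8x8 grid, row-major) into one
# integer: bit i is 1 where cell i is brighter than the overall mean.
def fingerprint(cells)
  raise ArgumentError, 'expected 64 cell averages' unless cells.size == 64

  mean = cells.sum.to_f / cells.size
  cells.each_with_index.sum { |value, i| value > mean ? (1 << i) : 0 }
end

# Number of bits that differ between two fingerprints.
# 0 is a perfect match; 64 means every cell disagrees.
def hamming_distance(fp_a, fp_b)
  (fp_a ^ fp_b).to_s(2).count('1')
end
```

In practice you would pick a threshold on the distance (a handful of bits, say) below which two images count as "the same"; the right value depends on your images and is something to tune rather than a fixed rule.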

Yet another family would be based on correlation, see spcor and friends. They might be more useful for finding a small area of an image.
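As an illustration of the idea behind correlation-based matching, the sketch below computes a normalised correlation coefficient between two equal-length arrays of grey-level pixel values in plain Ruby. This is only the textbook formula on whole images, not a reimplementation of `spcor`, and the name `correlation` is hypothetical.

```ruby
# Normalised correlation coefficient between two equal-length arrays of
# pixel values. Returns a Float in -1.0..1.0 (1.0 means the images vary
# together perfectly), or nil if either image is completely flat.
def correlation(pixels_a, pixels_b)
  if pixels_a.empty? || pixels_a.size != pixels_b.size
    raise ArgumentError, 'inputs must be non-empty and the same length'
  end

  n = pixels_a.size.to_f
  mean_a = pixels_a.sum / n
  mean_b = pixels_b.sum / n

  # Deviation of every pixel from its image's mean.
  dev_a = pixels_a.map { |v| v - mean_a }
  dev_b = pixels_b.map { |v| v - mean_b }

  covariance = dev_a.zip(dev_b).sum { |x, y| x * y }
  scale = Math.sqrt(dev_a.sum { |x| x * x } * dev_b.sum { |y| y * y })
  return nil if scale.zero?

  covariance / scale
end
```

Because the means are subtracted and the result is scaled, an overall brightness or contrast change between the two images does not affect the score.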

Many fancier image similarity metrics will take a variety of algorithms, run them all, and use a set of weighting factors to compute an overall similarity measure.

jcupitt
  • Very good explanation, thank you. I'll update my post about my problem. But you sure have the best (and the right) answer. – fschuindt Apr 11 '16 at 19:23
0

"Are the same" and "look the same" are two different things.

If you want to verify if 2 images "are the same", then just read them into 2 strings and compare them.

def same_image?(path1, path2)
  return true if path1 == path2
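  # File.binread reads the whole file as binary; File.read(path, 'rb')
  # would fail because its second argument is a length, not a mode
  image1 = File.binread(path1)
  image2 = File.binread(path2)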
  image1 == image2
end

Or, if your images are large, read them in chunks and compare as you go, so that neither file has to be held in memory in full.

def same_image?(path1, path2)
  return true if path1 == path2
  File.open(path1, 'rb') do |image1|
    File.open(path2, 'rb') do |image2|
      return false if image1.size != image2.size
      while (b1 = image1.read(1024)) and (b2 = image2.read(1024))
        return false if b1 != b2
      end
    end
  end
  true
end
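If you would rather not hand-roll the loop, the standard library can do the same job. The sketch below uses `FileUtils.compare_file` for a direct comparison and `Digest::SHA256.file` to get a checksum you can store and compare later; the method names `same_file_contents?` and `content_digest` are hypothetical wrappers.

```ruby
require 'fileutils'
require 'digest'

# True if the two files have byte-for-byte identical contents.
def same_file_contents?(path1, path2)
  FileUtils.compare_file(path1, path2)
end

# Hex SHA-256 checksum of a file's contents. Handy when one image has to
# be checked against many others: store the digest once and compare
# strings instead of re-reading both files each time.
def content_digest(path)
  Digest::SHA256.file(path).hexdigest
end
```

Like the methods above, both of these only tell you whether the files are identical as bytes, not whether the pictures look alike.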

Verifying whether 2 images "look the same" is a very hard job. For example, a PNG and a JPG may look identical, but they almost never have the same pixel array. Even if the 2 images are of the same type, they may look the same while the second image is actually offset by one pixel compared to the first, or the saturation of the 2 images is slightly different, or ...

I've never done that, and I'm not sure if it is doable.

Aetherus
  • Thank you, I'm actually using your method of comparing the files as strings. It turns out that worked better for me. I'll update my post explaining why I did not mark your post as the best answer. Thank you again. – fschuindt Apr 11 '16 at 19:24