I'm writing a program that takes an image as input, checks it against the images in a database, and outputs the image with the same hash.

However, when I use hash("imagepath"), two copies of the same image give different hashes, even when the only difference is the file name, which makes me believe the name is the issue.

Is there a way to easily ignore the name of the image? (png)
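
For context: Python's built-in `hash("imagepath")` hashes the path string itself, not the file it points to, so a renamed copy will hash differently. A minimal sketch of hashing the file's bytes instead, assuming the two files are byte-for-byte identical copies (a re-encoded PNG of the same picture would still differ). The helper name `file_content_hash` and the chunk size are made up for illustration:

```python
import hashlib


def file_content_hash(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the bytes stored in the file at `path`.

    Hypothetical helper: only the file's contents are hashed, so the
    file name has no effect on the result.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large images do not have to fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical bytes give the same digest whatever they are called, so the digests can be stored in the database and compared directly.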

1 Answer

How I solved it: I ended up not using hashing. Instead, by piecing bits of code together, I compute the image's average pixel value and then look for a database image with the same average. The averages are stored in a list, so the lookup returns an index, which I then use to find the name.

import requests
from PIL import Image

# Database of possible image average pixel values
clone_imgs = [88.0465, 46.2568, 102.6426, ...]

image = "<image url>"  # URL of the image to look up
img_data = requests.get(image).content
with open("image.png", "wb") as handler:  # Download the image as "image.png" (replace "image.png" with the path where you want to save it)
    handler.write(img_data)
img = Image.open("image.png")  # Open the image for reading
img = img.resize((100, 100), Image.LANCZOS)  # Shrink to 100x100 (Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter)
img = img.convert("L")  # Convert to grayscale
img_pixel_data = list(img.getdata())
img_avg_pixel = sum(img_pixel_data) / len(img_pixel_data)  # Get the average pixel value

clone_img_index = clone_imgs.index(img_avg_pixel)  # Find the same value in the database (raises ValueError if it is not there)

This worked for me, but it has a few downsides:

  1. The images need to be 100% identical in color (a single pixel being off can change the average and break the match)
  2. Many different images can share the same average pixel value. My database only contained 800 images, so it still worked (however, I had to increase the resize from 10x10 to 100x100 to avoid ending up with collisions)
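
The first downside comes from `list.index`, which needs an exact float match. One possible workaround is to take the nearest stored average within a tolerance. This is only a sketch: the function name `closest_index` and the tolerance of 0.5 are assumptions, not something I tuned, and a looser tolerance makes the second downside worse.

```python
def closest_index(values, target, tolerance=0.5):
    """Return the index of the entry in `values` nearest to `target`,
    or None if even the nearest entry differs by more than `tolerance`.

    Hypothetical helper; the default tolerance is an arbitrary example.
    """
    if not values:
        return None
    # Index of the stored average with the smallest absolute difference.
    best = min(range(len(values)), key=lambda i: abs(values[i] - target))
    return best if abs(values[best] - target) <= tolerance else None


clone_imgs = [88.0465, 46.2568, 102.6426]  # example values only
print(closest_index(clone_imgs, 46.2571))  # nearest stored average is within tolerance
print(closest_index(clone_imgs, 60.0))     # nothing is within tolerance
```

Returning None instead of raising makes it easy to tell "no match" apart from a real index.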
  • I guess this is too late, but there is a [perceptual hash library](https://github.com/JohannesBuchner/imagehash) for Python. These are designed to produce matching or similar hashes even if the images are slightly different. – Nick ODell Jan 30 '22 at 20:04
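
To illustrate the idea behind such hashes, here is a rough plain-Python sketch of an average hash, one of the simpler perceptual hashes. It is not the linked library's code. The function names and the toy pixel values are made up, and in practice the pixel list would come from a small grayscale version of the image.

```python
def average_hash_bits(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the mean, else 0."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming_distance(bits_a, bits_b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Toy 2x2 grayscale "images" flattened to lists (made-up values)
original = [10, 200, 30, 220]
slightly_changed = [12, 198, 30, 221]
different = [200, 10, 220, 30]

# Small pixel changes leave the bits unchanged; a different layout flips them.
print(hamming_distance(average_hash_bits(original), average_hash_bits(slightly_changed)))
print(hamming_distance(average_hash_bits(original), average_hash_bits(different)))
```

A small Hamming distance means "probably the same image", so matching becomes a threshold on the distance rather than an exact equality check.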