
In a Django app, users upload photos, which then get upvoted/downvoted (kind of like 9gag).

I want to put in place a basic check that prevents the user from re-submitting images already recently submitted on the website.

I don't need an airtight solution. My question differs from other such questions on SO in that this isn't just a case of comparing two images; it's a case of comparing an uploaded image to, say, the 200 most recently uploaded images (my arbitrary cut-off). Performance takes the front seat.

Since I'm already thumbnailing all images (40px x 40px), I'm going to compare photo thumbnails instead of full-blown photos. This amounts to comparing down-sampled images, so it'll be faster and fuzzier (which is good).

My question is: is there a decent way to reduce an image histogram to a single number (in base 10 or 16, for instance)? If there is, I can store these numbers in the DB, compute the distance between them, and impose an arbitrary cut-off. An illustrative example would be nice. This, in my head, sounds like the fastest way to handle my case.

Alternatively, if it can't be done due to various reasons, that's a legit answer too.

Hassan Baig

1 Answer


You probably want to use some sort of perceptual image hashing. I haven't tried it, but it looks like https://pypi.python.org/pypi/ImageHash might do the trick.
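To illustrate the idea, here is a minimal sketch of one simple perceptual hash (an "average hash") in plain Python. It is not taken from the ImageHash package and is not tested against real photos. It assumes the thumbnail has already been shrunk to 8x8 grayscale, so the input is a flat list of 64 values in the range 0-255. The names `average_hash`, `hamming_distance`, `is_recent_duplicate` and the `max_distance=5` cut-off are all made up for this example.

```python
def average_hash(pixels):
    """Reduce 64 grayscale values (an 8x8 image, row by row) to a 64-bit integer.

    Each bit records whether that pixel is brighter than the image mean.
    """
    if len(pixels) != 64:
        raise ValueError("expected 64 grayscale values (an 8x8 image)")
    mean = sum(pixels) / len(pixels)
    bits = 0
    for value in pixels:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_recent_duplicate(new_hash, recent_hashes, max_distance=5):
    """True if new_hash is within max_distance bits of any stored hash.

    max_distance=5 is an arbitrary cut-off chosen for illustration.
    """
    return any(hamming_distance(new_hash, h) <= max_distance
               for h in recent_hashes)


# Synthetic "images": a gradient, a slightly brighter copy, and its inverse.
gradient = list(range(0, 256, 4))              # 64 values: 0, 4, ..., 252
brighter = [min(255, v + 3) for v in gradient]
inverted = [255 - v for v in gradient]

recent = [average_hash(gradient)]              # stands in for the last 200 rows
print(format(average_hash(gradient), "016x"))  # hex string, easy to store in the DB
print(is_recent_duplicate(average_hash(brighter), recent))
print(is_recent_duplicate(average_hash(inverted), recent))
```

The hash is a single integer (or 16-character hex string) per image, so it can be stored in a DB column, and comparing an upload against 200 stored hashes is just 200 XOR-and-count operations. Note that the distance here is the number of differing bits, not the numeric difference between the two values.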

maxymoo