
I am looking for a solution to identify duplicate products from their images to speed up my product database workflow.

I accept product listings from many suppliers and have thousands of listings, totalling hundreds of thousands of images. The same product may be stocked by several suppliers. Suppliers often use the same images, but with different watermarks or at different sizes, and each may describe the product slightly differently.

On my website, I want to list each product only once, from a single supplier. If I am sent a product that I already have, I want to efficiently identify the duplicate and ignore the new listing.

I currently use regex and text searches to help identify duplicates, but this is slow and not foolproof. I have read about hashing each image and searching on the hash, but my duplicate images are not byte-for-byte identical, so an exact hash would not match them.
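From what I have read, a perceptual hash (as opposed to an exact file hash) is meant to survive resizing and small changes such as watermarks. A minimal sketch of one such scheme, a difference hash ("dHash"), is below. It is in Python purely for illustration and assumes each image has already been decoded into a grid of grayscale values (0-255), which in practice would need an image library that is not shown here. The function names and the `max_distance` threshold of 10 bits are hypothetical choices, not taken from any particular tool.

```python
def resize_nearest(pixels, new_w, new_h):
    """Resize a grayscale grid (list of rows) by nearest-neighbour sampling."""
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def dhash(pixels, hash_size=8):
    """Return a (hash_size * hash_size)-bit difference hash as an integer.

    Each bit records whether a pixel is brighter than its right-hand
    neighbour in a tiny (hash_size + 1) x hash_size version of the image.
    """
    small = resize_nearest(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def find_duplicates(images, max_distance=10):
    """Compare every pair of images and return likely duplicates.

    images: dict mapping a name to its grayscale pixel grid.
    max_distance: assumed threshold; would need tuning on real data.
    Returns a list of (name_a, name_b, distance) tuples.
    """
    hashes = {name: dhash(grid) for name, grid in images.items()}
    names = sorted(hashes)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            distance = hamming(hashes[a], hashes[b])
            if distance <= max_distance:
                pairs.append((a, b, distance))
    return pairs
```

Two images would be treated as the same product photo when their hashes differ in only a few bits. Note that this sketch compares every pair of images, which would presumably be too slow for hundreds of thousands of images without some kind of index over the hashes.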

NB. I am using Windows. I do not know Python or Java. I do have a range of technical knowledge but I haven't yet found anything that isn't "first become an expert in Java, then..."

Is there a Windows app, API, or similar tool that, given a set of images, can return the duplicates?

user2470281
  • sounds similar to https://stackoverflow.com/questions/5730631/image-similarity-comparison – shikida Jul 02 '21 at 13:19
  • I saw that thread. It's 10 years old, and doesn't give any solutions I could use in a workflow to compare an image against many images. – user2470281 Jul 05 '21 at 09:32

0 Answers