A perceptual hash (pHash) is an algorithm for creating fingerprints of multimedia data (images, audio, etc.).
What is a perceptual hash?
A perceptual hash is a fingerprint of a multimedia file derived from various features of its content. Unlike cryptographic hash functions, which rely on the avalanche effect (a small change in the input leads to a drastic change in the output), perceptual hashes are "close" to one another if the features are similar.
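The contrast with cryptographic hashing can be illustrated with a minimal sketch. It assumes a toy "average hash" (one bit per pixel, set when the pixel is brighter than the image mean) over a hypothetical 4x4 grayscale image. This is a deliberately simple stand-in, not the algorithm pHash implements, and the names average_hash, hamming_distance and sha256_bits are made up for this illustration.

```python
import hashlib

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, 1 if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def sha256_bits(pixels):
    """SHA-256 digest of the raw pixel bytes, expanded to a list of 256 bits."""
    data = bytes(p for row in pixels for p in row)
    digest = hashlib.sha256(data).digest()
    return [(byte >> i) & 1 for byte in digest for i in range(8)]

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]

# The same image with a single pixel nudged slightly.
tweaked = [row[:] for row in image]
tweaked[0][0] = 14

perceptual_distance = hamming_distance(average_hash(image), average_hash(tweaked))
crypto_distance = hamming_distance(sha256_bits(image), sha256_bits(tweaked))

print("perceptual hash bits changed:", perceptual_distance)
print("SHA-256 bits changed:", crypto_distance)
```

The perceptual distance stays at zero because the nudged pixel is still darker than the mean, while the SHA-256 digests differ in a large share of their 256 bits.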
Relevance of Perceptual Hashing
Perceptual hashes must be robust enough to tolerate transformations or "attacks" on a given input, yet discriminating enough to distinguish between dissimilar files. Such attacks can include rotation, skew, contrast adjustment, and different compression schemes or formats. These competing requirements make perceptual hashing an interesting and active field of study in computer science.
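The two requirements can be sketched together in a few lines. The example below again assumes a toy average hash rather than any algorithm from pHash, applies a contrast adjustment as the "attack", and uses a hypothetical match threshold of 2 differing bits chosen only for this illustration. A hash this simple would not survive other attacks such as rotation.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, 1 if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def adjust_contrast(pixels, factor):
    """Stretch values around mid-gray (128) and clamp them to 0..255."""
    return [
        [max(0, min(255, round(128 + factor * (p - 128)))) for p in row]
        for row in pixels
    ]

THRESHOLD = 2  # hypothetical cut-off: at most 2 differing bits counts as a match

def looks_similar(a, b):
    """True if the two images' toy hashes are within THRESHOLD bits."""
    return hamming_distance(average_hash(a), average_hash(b)) <= THRESHOLD

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]

# "Attack": the same picture with its contrast boosted by 50%.
boosted = adjust_contrast(original, 1.5)

# A genuinely different picture: a black-and-white checkerboard.
checkerboard = [[255 if (r + c) % 2 else 0 for c in range(4)] for r in range(4)]

same_after_attack = looks_similar(original, boosted)
same_as_other = looks_similar(original, checkerboard)

print("matches contrast-boosted copy:", same_after_attack)
print("matches checkerboard:", same_as_other)
```

In practice the threshold is a tuning choice: a lower value rejects more altered copies, a higher one accepts more unrelated files.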
What is pHash?
pHash is an open source software library released under the GPLv3 license that implements several perceptual hashing algorithms and provides a C-like API for using those functions in your own programs. pHash itself is written in C++. pHash was created by Evan Klinger.
Project URL: http://www.phash.org