I was looking at hash functions and found that, given two similar strings, even ones that differ by a single bit, the results are completely different hash values. What I actually need is some sort of unique ID that is similar for similar inputs (the inputs will be millions of alphanumeric strings).
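To illustrate the behaviour I mean, here is a minimal sketch using SHA-256 from `hashlib`. The helper names `sha256_hex` and `differing_bits` and the two sample strings are made up for this illustration; any standard cryptographic hash should behave similarly.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# 'd' (0x64) and 'e' (0x65) differ in exactly one bit,
# so these two inputs are one bit apart.
a = "hello world"
b = "hello worle"

hash_a = sha256_hex(a)
hash_b = sha256_hex(b)

print(hash_a)
print(hash_b)
print("bits that differ (out of 256):", differing_bits(hash_a, hash_b))
```

The two digests share no visible structure, which is exactly the property I want to avoid for my use case.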
Requirements:
- Two equal strings must have the same hash.
- Two different strings must have different hashes.
- Two different strings that are quite similar must have different hashes that are nonetheless close to each other.
What would be a good approach to achieve that? I am using Python.