What would be the best string hashing function for say filename like strings? The strings would be similar to:
pics/test.pic
maps/test.map
materials/metal.mtl
If the data to be hashed does not call for anything elaborate, as is the case for ordinary text strings, you may want to try the FNV hash. FNV, short for Fowler/Noll/Vo after its creators, is a very fast algorithm that has been used in many applications with good results, and because it is so simple it should be one of the first hashes you try in an application.
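#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1: multiply by the FNV prime, then XOR in the next byte. */
uint32_t fnv_hash(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;           /* 32-bit FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++)
        h = (h * 16777619u) ^ p[i];     /* 16777619 is the 32-bit FNV prime */
    return h;
}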
Alternatively, use MD5, which is a general-purpose hash and would cover your needs as well, though it produces a 128-bit digest and is considerably slower than FNV.
There is no universally "best" hash function; it depends on how the hashes will be used.
Suppose you want a 32-bit integer so that you can use a small in-memory hash table.
Then you can use the FNV-1a algorithm:
hash = offset_basis
for each octet_of_data to be hashed
    hash = hash xor octet_of_data
    hash = hash * FNV_prime
return hash
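The FNV-1a pseudocode above could be written in C++ roughly as follows. This is a minimal sketch for the 32-bit variant, using the standard 32-bit offset basis and prime. The names `fnv1a_32` and `bucket_for` are made up for illustration, and `bucket_for` assumes a non-zero bucket count.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 32-bit FNV-1a: XOR the next byte into the hash first, then multiply.
std::uint32_t fnv1a_32(const std::string& s)
{
    std::uint32_t hash = 2166136261u;           // 32-bit FNV offset basis
    for (char c : s) {
        hash ^= static_cast<unsigned char>(c);  // fold in one octet
        hash *= 16777619u;                      // 32-bit FNV prime, wraps mod 2^32
    }
    return hash;
}

// Hypothetical helper: map a path to one of bucket_count slots
// in a small hash table (bucket_count must be non-zero).
std::size_t bucket_for(const std::string& path, std::size_t bucket_count)
{
    return fnv1a_32(path) % bucket_count;
}
```

For example, `bucket_for("pics/test.pic", 64)` gives a slot index in the range 0 to 63. The only difference from FNV-1 is the order of the XOR and the multiply.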
If you need to be confident that two different paths produce different hashes, you can use SHA-1.
If you also need it to be very hard to create collisions maliciously, you can use SHA-256.
Note that these last two algorithms produce long digests (20 bytes for SHA-1, 32 bytes for SHA-256), which may well be longer than your typical path.
Just use std::hash<std::string>. That's your library implementer's idea of the 'best' general purpose, non-cryptographic hash function.
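A minimal sketch of what that looks like in practice. The function names `path_hash` and `make_asset_table` and the integer asset IDs are invented for this example.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hash a path with the standard library's default string hash.
std::size_t path_hash(const std::string& path)
{
    return std::hash<std::string>{}(path);
}

// std::unordered_map uses std::hash<std::string> by default for
// std::string keys, so no custom hasher is needed here.
// The asset IDs are placeholder values.
std::unordered_map<std::string, int> make_asset_table()
{
    return {
        {"pics/test.pic", 1},
        {"maps/test.map", 2},
        {"materials/metal.mtl", 3},
    };
}
```

Keep in mind that the values returned by std::hash are implementation-defined and are only guaranteed to be consistent within a single run of the program, so they are not suitable for storing on disk or sharing between different builds.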