A good hash function should
- be bijective to not loose information, where possible, and have the least collisions
- cascade as much and as evenly as possible, i.e. each input bit should flip every output bit with probability 0.5 and without obvious patterns.
- if used in a cryptographic context there should not exist an efficient way to invert it.
A prime number modulus does not satisfy any of these points. It is simply insufficient. It is often better than nothing, but it's not even fast. Multiplying with an unsigned integer and taking a power-of-two modulus distributes the values just as well, that is not well at all, but with only about 2 cpu cycles it is much faster than the 15 to 40 a prime modulus will take (yes integer division really is that slow).
To create a hash function that is fast and distributes the values well the best option is to compose it from fast permutations with lesser qualities like they did with PCG for random number generation.
Useful permutations, among others, are:
- multiplication with an uneven integer
- binary rotations
- xorshift
Following this recipe we can create our own hash function or we take splitmix which is tested and well accepted.
If cryptographic qualities are needed I would highly recommend to use a function of the sha family, which is well tested and standardised, but for educational purposes this is how you would make one:
First you take a good non-cryptographic hash function, then you apply a one-way function like exponentiation on a prime field or k
many applications of (n*(n+1)/2) mod 2^k
interspersed with an xorshift when k
is the number of bits in the resulting hash.