I have smallish(?) sets (ranging in count from 0 to 100) of unsigned 32-bit integers. For a given set, I want to come up with minimal parameters that describe a minimal(istic) perfect hash of that set. A high-level version of the code I used to experiment with the idea ended up looking something like:
def murmur(key, seed=0x0):
    # Implements 32-bit murmur3 hash...
    return theHashedKey

sampleInput = [18874481, 186646817, 201248225, 201248705, 201251025, 201251137, 201251185, 184472337, 186649073, 201248625, 201248721, 201251041, 201251153, 184473505, 186649089, 201248657, 201251009, 201251057, 201251169, 186646818, 201248226, 201248706, 201251026, 201251138, 201251186, 186649074, 201248626, 201248722, 201251042, 201251154, 186649090, 201248658, 201251010, 201251058, 201251170]

for seed in range(11111):  # arbitrary upper seed limit
    # a perfect hash needs at least len(sampleInput) slots, and modulus 0 is invalid
    for modulus in range(max(len(sampleInput), 1), 10000):
        hashSet = set(murmur(x, seed=seed) % modulus for x in sampleInput)
        if len(hashSet) == len(sampleInput):  # no two keys share a slot
            print('minimal modulus', modulus, 'for seed', seed)
            break
This is just basic pseudocode for a two-axis brute-force search. If I add lines that keep track of the results, I can find the seed and modulus values that give a perfect hash and then select the pair with the smallest modulus.
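For concreteness, here is a minimal runnable sketch of that search with the bookkeeping included. It is written under some assumptions: `murmur3_32` is a murmur3-style 32-bit mix for a single 4-byte key, written from memory and not verified against a reference implementation, and `find_params`, `max_seed`, and `max_modulus` are names made up for this illustration. The hashes are computed once per seed, and the modulus loop stops at the best modulus found so far, since a larger one cannot improve the result.

```python
MASK32 = 0xFFFFFFFF


def murmur3_32(key, seed=0):
    """murmur3-style 32-bit hash of one unsigned 32-bit key (assumed variant)."""
    k = (key * 0xCC9E2D51) & MASK32
    k = ((k << 15) | (k >> 17)) & MASK32      # rotate left by 15
    k = (k * 0x1B873593) & MASK32
    h = (seed ^ k) & MASK32
    h = ((h << 13) | (h >> 19)) & MASK32      # rotate left by 13
    h = (h * 5 + 0xE6546B64) & MASK32
    h ^= 4                                    # key length in bytes
    h ^= h >> 16                              # final avalanche
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def find_params(keys, max_seed=11111, max_modulus=10000):
    """Return (seed, modulus) with the smallest modulus found, or None."""
    keys = sorted(set(keys))                  # duplicates would always collide
    smallest = max(len(keys), 1)              # fewer slots than keys cannot work
    best = None
    limit = max_modulus                       # exclusive upper bound on modulus
    for seed in range(max_seed):
        hashes = [murmur3_32(x, seed) for x in keys]
        for modulus in range(smallest, limit):
            if len({h % modulus for h in hashes}) == len(keys):
                best = (seed, modulus)
                limit = modulus               # only smaller moduli can improve
                break
        if best is not None and best[1] == smallest:
            break                             # truly minimal, stop early
    return best


sample_input = [18874481, 186646817, 201248225, 201248705, 201251025,
                201251137, 201251185, 184472337, 186649073, 201248625,
                201248721, 201251041, 201251153, 184473505, 186649089,
                201248657, 201251009, 201251057, 201251169, 186646818,
                201248226, 201248706, 201251026, 201251138, 201251186,
                186649074, 201248626, 201248722, 201251042, 201251154,
                186649090, 201248658, 201251010, 201251058, 201251170]

if __name__ == "__main__":
    # A small seed range keeps the demo quick; widen it to search harder.
    print(find_params(sample_input, max_seed=200))
```

The lookup on the embedded side would then only need the stored seed and modulus plus the same hash function, with the slot index being `murmur3_32(key, seed) % modulus`.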
It seems to me that there should be a more elegant/deterministic way to come up with these values? But that's where my math skills overflow.
I'm experimenting in Python right now, but ultimately want to implement something in C on a small embedded platform.