I've done some searching and found a few useful posts about supporting perfect (i.e. collision-free) hashing in Java.
Why doesn't Java's hashCode support universal hashing?
But I am looking for a practical solution, ideally in the form of a tested library. My situation is well suited to perfect hashing: essentially, the set of keys can be assumed fixed, and the program runs for a long time and does a lot of lookups. (This is not exactly true, but keys are added rarely enough that it's a close approximation, and if I have to re-hash periodically to deal with that, that's OK.)
Basically, I would like to be able to increase the load factor while also reducing collisions. In other words, the objectives are to reduce memory use and increase throughput (i.e. lookups per second).
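For context, the only lever a plain HashMap exposes for this trade-off is the loadFactor argument of its (initialCapacity, loadFactor) constructor; something like the following is the status quo I'm trying to beat (the sizes are made up):

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFactorDemo {
    public static void main(String[] args) {
        int expectedSize = 1_000_000; // hypothetical key count

        // Size the table so it never resizes, and raise the load factor to
        // save memory. With ordinary hashCode()-based hashing, a higher load
        // factor means more collisions; a perfect hash would avoid that tax.
        Map<String, Integer> map =
                new HashMap<>((int) Math.ceil(expectedSize / 0.95), 0.95f);
    }
}
```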
There are some issues. Obviously, if hashCode() doesn't return distinct values for distinct keys, then perfect hashing is impossible. There are also considerations besides the hashing algorithm itself, like the cost of computing hashCode() (or whether I should cache the hash codes on the key objects, etc.), and the same goes for whatever other function I use to initially map my objects to integers or longs.
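As an aside, the caching idea would look roughly like this; it's a minimal sketch modeled on how java.lang.String lazily caches its hash, and the class and field names are my own:

```java
import java.util.Arrays;

public final class CachedKey {
    private final byte[] data;
    private int hash; // 0 means "not yet computed" (same convention as String)

    public CachedKey(byte[] data) {
        this.data = data.clone();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) { // like String, a hash that happens to be 0 is recomputed each call
            for (byte b : data) {
                h = 31 * h + b;
            }
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedKey && Arrays.equals(data, ((CachedKey) o).data);
    }
}
```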
What I am envisioning is being able to re-hash in a background thread, trying different hash functions until it finds a perfect one, or at least a good one. I'm open to other approaches, though. And I would prefer tested code over writing it myself, although I'm open to that too.
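To make that concrete, here is a rough sketch of the search I have in mind (this is the part that would run in a background thread). It tries random seeds for a simple mixer over the keys' existing hashCode() values until one maps the fixed key set into a table of size m with no collisions. Everything here is hypothetical, and it can only succeed if hashCode() is already injective on the key set:

```java
import java.util.Arrays;
import java.util.OptionalInt;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;

public final class SeedSearcher {

    // A simple seeded mixer applied on top of the key's existing hashCode().
    static int mix(int h, int seed) {
        h *= (seed | 1);          // keep the multiplier odd so it is invertible mod 2^32
        return h ^ (h >>> 16);
    }

    // Try up to maxTries random seeds; return one that places every key in a
    // distinct slot of a table of size m, or empty if none is found in budget.
    static OptionalInt findPerfectSeed(Set<?> keys, int m, int maxTries) {
        boolean[] used = new boolean[m];
        outer:
        for (int t = 0; t < maxTries; t++) {
            Arrays.fill(used, false);
            int seed = ThreadLocalRandom.current().nextInt();
            for (Object k : keys) {
                int slot = Math.floorMod(mix(k.hashCode(), seed), m);
                if (used[slot]) {
                    continue outer; // collision: abandon this seed, try another
                }
                used[slot] = true;
            }
            return OptionalInt.of(seed); // no collisions: this seed is perfect for these keys
        }
        return OptionalInt.empty();
    }
}
```

I realize that for m close to the number of keys, a perfect seed from such a simple family is unlikely to exist (the birthday bound works against it); published perfect-hashing schemes like CHD or BDZ add more structure, which is exactly why I'd rather use a tested library.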