My question is not about the double hashing technique (http://en.wikipedia.org/wiki/Double_hashing), which is a way to resolve collisions. It is about handling existing collisions in a hash table of strings. Say we have a collision: several strings land in the same bucket, so we must walk the bucket comparing the candidate against each string. It seems it would make sense to compute an additional hash value for fast string comparison: compare the hash values first and reject quickly on a mismatch, falling back to a full comparison only when they match. The hash value could be computed lazily and saved with the string. Have you used such a technique? Can you provide a reference? If not, do you think it is not worth doing because the performance gain is questionable? (A sketch of what I mean follows the notes.) Some notes:
- I put tag "Java" since I did measurements in Java: String.hashCode() in most cases outperforms String.equals() (and BTW greatly outperforms manual hash code calculation: hashCode = 31 * hashCode + strInTable.charAt(i));
- Of course, the same could be asked about any string comparison, not necessarily strings in a hash table. But I am considering the specific situation of a huge number of strings kept in a hash table.
- This probably makes the most sense if the strings in a bucket are somewhat similar (as in the Rabin-Karp algorithm), but I am looking for your opinion on the general case.
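For concreteness, here is a minimal sketch of the quick-rejection check I have in mind (all class and method names are made up for illustration; it reuses String.hashCode() as the cached value, but if the bucket already groups strings by their full hashCode, an independent second hash function would be needed instead):

```java
import java.util.ArrayList;
import java.util.List;

// One bucket entry: a string plus a lazily computed, cached hash value.
final class CachedString {
    private final String value;
    private int hash;           // cached hash value
    private boolean computed;   // distinguishes "not yet computed" from a legitimate hash of 0

    CachedString(String value) {
        this.value = value;
    }

    private int cachedHash() {
        if (!computed) {
            hash = value.hashCode();  // computed at most once, then reused
            computed = true;
        }
        return hash;
    }

    boolean matches(String candidate, int candidateHash) {
        // Quick rejection: different hash values mean the strings cannot be equal.
        if (cachedHash() != candidateHash) {
            return false;
        }
        // Hash values agree; confirm with the full comparison.
        return value.equals(candidate);
    }
}

// A bucket that scans its entries using the quick-rejection check.
final class Bucket {
    private final List<CachedString> entries = new ArrayList<>();

    void add(String s) {
        entries.add(new CachedString(s));
    }

    boolean contains(String candidate) {
        int candidateHash = candidate.hashCode();  // computed once per lookup
        for (CachedString entry : entries) {
            if (entry.matches(candidate, candidateHash)) {
                return true;
            }
        }
        return false;
    }
}
```

The boolean flag is there because a string's hash can legitimately be 0; without the flag, such a string's hash would be recomputed on every lookup.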