
While reading about the pigeonhole principle on Wikipedia, I came across this: "collisions are inevitable in a hash table because the number of possible keys exceeds the number of indices in the array. No hashing algorithm, no matter how clever, can avoid these collisions". But isn't gperf doing exactly this?

Please enlighten me.

Paolo
user138645
  • A single collision is not really the end of the world. A hash table whose collision lists are on the order of the table size is a problem. – artless noise Sep 02 '22 at 14:58

2 Answers


gperf creates the hash function and the hash table based on a predefined list of strings.

Because the whole list is known when the function is generated, gperf can choose a hash function whose range is large enough that every key in that list gets its own slot.

That only works if you know every possible key upfront, an assumption that does not hold for the description in the Wikipedia entry, which refers to a general-purpose hash table whose keys are not fixed in advance.
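To make this concrete, here is a minimal sketch of the idea, not gperf's actual output: the keyword set and the length-based hash are deliberately chosen so that the fixed keys already map to distinct slots, which is the property gperf searches for automatically.

    /* Sketch of a perfect hash for a fixed key set (hypothetical example,
     * not generated by gperf). The four keywords all have different
     * lengths, so "length - 2" already maps them to distinct slots 0..3. */
    #include <stdio.h>
    #include <string.h>

    static const char *table[] = { "if", "for", "else", "while" };

    static unsigned hash(const char *s)
    {
        return (unsigned)strlen(s) - 2;   /* collision-free for this fixed set */
    }

    /* Returns 1 if s is one of the four keywords. */
    static int is_keyword(const char *s)
    {
        size_t len = strlen(s);
        if (len < 2 || len > 5)
            return 0;                      /* outside the table's range */
        return strcmp(table[hash(s)], s) == 0;   /* verify the slot, as generated lookup code does */
    }

    int main(void)
    {
        printf("%d %d %d\n", is_keyword("while"), is_keyword("when"), is_keyword("if"));
        return 0;   /* prints: 1 0 1 */
    }

The key point is that the function only has to be collision-free for the listed keys; any other input is rejected by the final string comparison.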

Kos

From gperf's website: "For a given list of strings, it produces a hash function and hash table, ..." In other words, it has to know all the strings in advance in order to construct a function that works without collisions.

The usual hash functions you use in general-purpose programming languages have to handle arbitrary strings as input, one after another (the full list is not given up front), and therefore they can produce collisions.
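For illustration, here is a toy hash (sum of bytes modulo a small table size, not any particular library's function): with more possible keys than slots, two distinct keys end up in the same slot, exactly as the pigeonhole principle predicts.

    /* Toy example of an ordinary (non-perfect) hash colliding. */
    #include <stdio.h>

    #define TABLE_SIZE 8

    static unsigned hash(const char *s)
    {
        unsigned sum = 0;
        while (*s)
            sum += (unsigned char)*s++;
        return sum % TABLE_SIZE;          /* only 8 possible slots */
    }

    int main(void)
    {
        /* Both keys land in slot 3, a collision the table must resolve
         * (e.g. by chaining). */
        printf("hash(\"ab\") = %u\n", hash("ab"));
        printf("hash(\"ba\") = %u\n", hash("ba"));
        return 0;
    }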

eumiro