
Given the fact that I will always have a 128-bit integer as a key, and a maximum number of elements, is it possible to make a perfect hashing function?

I want to store GUIDs as keys, and I want a fixed-size, no-linked-list solution. That is, if there is a collision, I lose the second element.

Can someone recommend a hashing function that would do this?

halivingston
  • Are these GUIDs generated locally? Only a small portion of each GUID will actually differ in that case - I forget the details but you may want to look up what GUIDs are made up of and only use the parts that are likely to differ the most. – 500 - Internal Server Error Jan 23 '13 at 02:41
  • As it turns out, hashing on GUIDs has already been discussed [here](http://stackoverflow.com/questions/7326593/guid-gethashcode-uniqueness). – 500 - Internal Server Error Jan 23 '13 at 02:49
  • @500-InternalServerError Actually, most modern GUID algorithms are almost entirely random bits, as opposed to older mechanisms in which about half was based on hardware keys and half on timestamp. – Servy Jan 23 '13 at 05:05
  • @Servy: I'll take your word for it - I haven't studied the details recently. – 500 - Internal Server Error Jan 23 '13 at 05:38
  • @500-InternalServerError http://blogs.msdn.com/b/ericlippert/archive/2012/05/07/guid-guide-part-three.aspx – Servy Jan 23 '13 at 05:40
  • Think of your problem as a search over hash functions. If I understand your problem setup correctly, there certainly are one or more hash functions that give zero collisions. Finding one might be the tricky part. – David J. Dec 18 '19 at 07:23

1 Answer


> Given the fact that I will always have a 128-bit integer as a key, and a maximum number of elements, is it possible to make a perfect hashing function?

If the entropy of the key (the expected amount of information the variable carries) is greater than what the hash result can accommodate, the answer is no. If the entropy of the key is equal to or lower than what the hash result can accommodate, the answer is yes.

What that means is that if your hashing function, for example, generates a 16-bit result (65536 possible values) and your 128-bit key can assume at most 65536 different values (so the entropy of the 128-bit key is at most 16 bits), then such a hashing function exists. On the other hand, if your key can assume more than 65536 distinct values, there is no way you can hash it into 65536 buckets without collisions: by the pigeonhole principle, at least two keys must land in the same bucket.
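The pigeonhole argument can be demonstrated directly: no hash function, however clever, can map 65537 distinct keys into 65536 buckets without a collision. A minimal sketch (the helper `has_collision` is hypothetical, not part of the original answer):

```python
def has_collision(keys, hash_fn, buckets):
    """Return True if any two keys map to the same bucket."""
    seen = set()
    for k in keys:
        b = hash_fn(k) % buckets
        if b in seen:
            return True
        seen.add(b)
    return False

# One more key than buckets: a collision is unavoidable,
# regardless of which hash function is used.
assert has_collision(range(65537), hash, 65536)
```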

> Can someone recommend a hashing function that would do this?

Without knowing the set of values the key can actually assume, no one can give you such a hashing function, even if you knew for a fact that the key's entropy is low enough for one to exist.
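If the key set *is* known up front, one common approach is to search over a family of hash functions until one happens to be collision-free on that set, as one of the comments suggests. A minimal sketch, not from the original answer; `find_perfect_seed`, the multiplicative scheme, and the table size are all illustrative assumptions:

```python
import random

def find_perfect_seed(keys, table_size, max_tries=100_000):
    """Search for a seed such that multiplicative hashing maps every
    key in `keys` to a distinct bucket (perfect for this key set).
    Returns the seed, or None if the search gives up."""
    if len(keys) > table_size:
        return None  # pigeonhole: impossible
    for _ in range(max_tries):
        seed = random.getrandbits(64) | 1  # random odd multiplier
        buckets = {((k * seed) >> 64) % table_size for k in keys}
        if len(buckets) == len(keys):  # no two keys shared a bucket
            return seed
    return None

# Hypothetical 128-bit keys standing in for GUIDs:
keys = [random.getrandbits(128) for _ in range(100)]
seed = find_perfect_seed(keys, table_size=65536)
```

Note that the search only succeeds with reasonable probability when the table is much larger than the key set (here 100 keys into 65536 buckets); packing the keys tightly requires more sophisticated constructions such as two-level (FKS) hashing.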

RoBiK