I just learned that:
- Dictionaries in .NET are implemented as Hash tables from this answer and the linked MSDN article about the
Dictionary<TKey, TValue>
Class . - The string hash function
GetHashCode()
does not provide a unique hash code value for each unique string value. Different strings can return the same hash code, according to the corresponding MSDN article about the string class.
This leads me to think, that Dictionaries in .NET (at least when using strings as keys) are susceptible to key collisions.
What happens on such a key collision? Are there any known unique string values, that actually collide? Will the dictionary be broken on those key values?
Additionally:
- Does it depend, whether the code is running on 32bit or 64bit system?
- Is using short strings up to a specific lenght safe? Safer?
Note: I am not referring to a specific .NET CLR, but if it matters, let's talk about the 4.5.2 32bit version for the desktop.
Notes about duplicates:
- Actually I am not so much asking about the collisions themselves, but the implications of them with regard to functionality/correctness.
- Can 2 different string have the same hash code in C#? addresses the fact that strings have non-unique hashes, which I already know and do not ask about. This is also true for What is hashCode used for? Is it unique?
- I removed the part about the the likeliness of the key collision thus Probability of getting a duplicate value when calling GetHashCode() on strings should no more be a duplicate.
- What happens when hash collision happens in Dictionary key? helped me out, so I consider this question a duplicate.