Before I begin: I tested all the code samples here on Mono, and there is one noticeable difference between the GetHashCode implementations:
string.Empty.GetHashCode(); // returns 0 in Mono 3.10
string.Empty.GetHashCode(); // returns 757602046 in .NET 4.5.1
I based my implementation on this SO answer by @JonSkeet, and in the comments he also suggests using 0 as the hash code for null values (I wasn't sure how I should hash them):
"I usually use 0 as the effective hash code for null - which isn't the same as ignoring the field."
So, given the following implementation (Mono 3.10):
public class Entity {
    public int EntityID { get; set; }
    public string EntityName { get; set; }

    public override int GetHashCode() {
        unchecked {
            int hash = 15485863; // prime number
            int multiplier = 1299709; // another prime number
            hash = hash * multiplier + EntityID.GetHashCode();
            hash = hash * multiplier + (EntityName != null ? EntityName.GetHashCode() : 0);
            return hash;
        }
    }
}
It is quite easy to find a collision, e.g.:
var hash1 = new Entity { EntityID = 1337, EntityName = "" }.GetHashCode();
var hash2 = new Entity { EntityID = 1337, EntityName = null }.GetHashCode();
bool equals = hash1 == hash2; // true
I could replace the 0 used for null with some other number, but that won't fix the problem: there is still a chance that some string hashes to exactly that number, and I'd get another collision.
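To make that concern concrete, here is a minimal sketch. It assumes the same combining scheme as above, but takes the name's hash code as a plain int so the effect of the null stand-in is visible. The names `NullSentinelDemo`, `Combine`, and `nullSentinel`, and the value 17, are made up for illustration only.

```csharp
using System;

public static class NullSentinelDemo
{
    // Same combining scheme as Entity.GetHashCode above, except that the
    // name's hash code is passed in directly instead of being computed.
    public static int Combine(int entityId, int nameHash)
    {
        unchecked
        {
            int hash = 15485863;      // prime number
            int multiplier = 1299709; // another prime number
            hash = hash * multiplier + entityId.GetHashCode();
            hash = hash * multiplier + nameHash;
            return hash;
        }
    }

    public static void Main()
    {
        const int nullSentinel = 17; // hypothetical replacement for 0

        // Runtime-dependent: 0 on Mono 3.10, a different value on .NET 4.5.1.
        int emptyHash = string.Empty.GetHashCode();

        // null and "" now collide only if "" happens to hash to the sentinel.
        Console.WriteLine(Combine(1337, nullSentinel) == Combine(1337, emptyHash));

        // Stand-in for any string whose hash code happens to equal the
        // sentinel: it collides with null exactly as "" did with 0.
        int unluckyStringHash = nullSentinel;
        Console.WriteLine(Combine(1337, nullSentinel) == Combine(1337, unluckyStringHash));
    }
}
```

The second comparison is true by construction, which is the point: whichever int stands in for null, a string that hashes to the same int produces the same combined value.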
My question: how should I handle null values when using the algorithm from the example above?