I've been reading this article because Jon Skeet linked to it in this answer. I'm trying to really understand how hashing works and why Jon likes the algorithm he provided so much. I'm not claiming to have an answer to that yet, but I do have a specific question about the base System.String implementation of GetHashCode.
Consider the code, focusing on the line annotated with <<<<<==========:
    public override unsafe int GetHashCode()
    {
        if (HashHelpers.s_UseRandomizedStringHashing)
            return string.InternalMarvin32HashString(this, this.Length, 0L);
        fixed (char* chPtr = this)
        {
            int num1 = 352654597;
            int num2 = num1;
            int* numPtr = (int*) chPtr;
            int length = this.Length;
            while (length > 2)
            {
                num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
                num2 = (num2 << 5) + num2 + (num2 >> 27) ^ numPtr[1];
                numPtr += 2;
                length -= 4; // <<<<<==========
            }
            if (length > 0)
                num1 = (num1 << 5) + num1 + (num1 >> 27) ^ *numPtr;
            return num1 + num2 * 1566083941;
        }
    }
Why do they only process every fourth character? And, if you're willing, why do they process the string from right to left?