What would be a good hash code for a vehicle identification number (VIN), that is, a string of digits and letters of the form "9X9XX99X9XX999999", where a "9" represents a digit and an "X" represents a letter?
... anything goes ... /AgainstMethod – wildplasser Jun 17 '21 at 13:54
Can you elaborate a bit more on your use case? What properties would a "good" hash function have here? – templatetypedef Jun 17 '21 at 14:07
1 Answer
One reasonable approach is to hash the entire string with a hash function suitable for strings; for example, GCC's C++ Standard Library uses a MurmurHash variant for std::hash<std::string>.
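As a minimal sketch of that first option, assuming the VIN is held in a std::string: the helper names hash_vin_string and insert_vin are made up for illustration, and the algorithm behind std::hash<std::string> is implementation-defined, so the actual values differ between standard libraries.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hash the whole VIN with the standard library's string hash.
// (hash_vin_string is a hypothetical helper name, not part of any library.)
std::size_t hash_vin_string(const std::string& vin) {
    return std::hash<std::string>{}(vin);
}

// std::unordered_set<std::string> uses std::hash<std::string> by default,
// so no custom hasher is needed to store VINs.
// Returns true if the VIN was not already in the set.
bool insert_vin(std::unordered_set<std::string>& seen, const std::string& vin) {
    return seen.insert(vin).second;
}
```

If the VINs only ever go into std::unordered_set or std::unordered_map, this is likely all that is needed.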
If you wanted to get more hands-on, you could group all the digits to form one 11-digit number. Each of the 6 letters can take only 26 different values, which is fewer than 2^5 = 32, so each fits in 5 bits. Mapping each letter to a value from 0 to 25, you could cheaply create a number from those letters (let's call them ABCDEF) by evaluating: A + B * 2^5 + C * 2^10 + D * 2^15 + E * 2^20 + F * 2^25
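The packing step could be sketched like this. The names PackedVin and pack_vin are hypothetical, and the code assumes a well-formed, uppercase, ASCII VIN in the form 9X9XX99X9XX999999; it does not validate its input.

```cpp
#include <cstdint>
#include <string>

// Result of splitting a VIN into its numeric and alphabetic parts.
struct PackedVin {
    std::uint64_t digits;   // the 11 digits read as one decimal number
    std::uint32_t letters;  // the 6 letters, 5 bits each (30 bits total)
};

// Assumption: `vin` has exactly 11 digits and 6 uppercase ASCII letters.
PackedVin pack_vin(const std::string& vin) {
    PackedVin out{0, 0};
    int shift = 0;
    for (char c : vin) {
        if (c >= '0' && c <= '9') {
            // Append the digit to the decimal number.
            out.digits = out.digits * 10 + static_cast<std::uint64_t>(c - '0');
        } else if (c >= 'A' && c <= 'Z' && shift < 30) {
            // Map 'A'..'Z' to 0..25 and place it in the next 5-bit slot:
            // first letter * 2^0, second * 2^5, third * 2^10, and so on.
            out.letters |= static_cast<std::uint32_t>(c - 'A') << shift;
            shift += 5;
        }
    }
    return out;
}
```

For the made-up VIN "1A2BC34D5EF678901", this yields digits = 12345678901 and letters = 0 + 1 * 2^5 + 2 * 2^10 + 3 * 2^15 + 4 * 2^20 + 5 * 2^25 = 172066848.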
Then, separately hash both the 11-digit number and the number created from the letters with a decent hash function, and XOR or add the results; that should give you quite a good hash value for your VIN. I haven't personally evaluated it, but Thomas Mueller recommends and explains something ostensibly suitable here:
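    #include <stdint.h>

    // 64-bit integer mixer: two xor-shift-multiply rounds, then a final xor-shift.
    uint64_t hash(uint64_t x) {
        x = (x ^ (x >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
        x = (x ^ (x >> 27)) * UINT64_C(0x94d049bb133111eb);
        x = x ^ (x >> 31);
        return x;
    }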
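To show how the two pieces might be combined, here is a rough sketch of the "hash both, then XOR" step. The mixer is the one quoted above, renamed mix64 here to avoid a name clash; hash_vin_parts is a hypothetical name, and its two arguments are assumed to be the 11-digit number and the 30-bit letter value described earlier.

```cpp
#include <cstdint>

// Same mixing function as quoted above, under a different name.
std::uint64_t mix64(std::uint64_t x) {
    x = (x ^ (x >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
    x = (x ^ (x >> 27)) * UINT64_C(0x94d049bb133111eb);
    x = x ^ (x >> 31);
    return x;
}

// Hash each part separately, then XOR the two results.
// `digits`  : the 11 digits of the VIN as one number (assumed input)
// `letters` : the 6 letters packed 5 bits each (assumed input)
std::uint64_t hash_vin_parts(std::uint64_t digits, std::uint64_t letters) {
    return mix64(digits) ^ mix64(letters);
}
```

One thing to watch with plain XOR: if the two inputs happen to be equal, the results cancel and the hash is 0. Adding the two results instead of XORing them avoids that particular cancellation.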
