In my table, I have two columns of the Datatype 'Varchar' with the same length. These columns are binary representations of information, and I would like to select the Hamming Distance between them, but I can't find any function to do that. Does anyone have any guidance?
Asked
Active
Viewed 139 times
0
-
Does this answer your question? [Hamming Distance / Similarity searches in a database](https://stackoverflow.com/questions/9606492/hamming-distance-similarity-searches-in-a-database) – SMor Jul 14 '22 at 17:40
1 Answers
1
Create an auxiliary numbers table with unique sequential integers at least up until the max string length and then you can do
SELECT CASE WHEN LEN(@String1) = LEN(@String2) THEN COUNT(*) END
FROM dbo.Numbers
WHERE Number BETWEEN 1 AND LEN(@String1)
AND SUBSTRING(@String1, Number, 1) <> SUBSTRING(@String2, Number, 1)
In future product versions the auxiliary numbers table can be swapped out by the GENERATE_SERIES
function.
NB: If you are planning on running this calculation over many strings - e.g. to return the row with the closest Hamming Distance you'll likely find performance an issue though...

Martin Smith
- 438,706
- 87
- 741
- 845
-
Perhaps add ... AND LEN(@String1) = LEN(@String2) Either way +1 – John Cappelletti Jul 14 '22 at 17:49
-
1@JohnCappelletti - the "with the same length" was a given in the question here. I'll add that anyway though to be on the safe side (and replace the `COUNT` with `SUM(1)` so `NULL` is returned for that case rather than zero – Martin Smith Jul 14 '22 at 17:52