How can I convert the result of mdq.Similarity to the number of edits needed for two words to match?
This function is part of Master Data Services (MDS) in Microsoft SQL Server and is defined as:
USE [mds]
GO
ALTER FUNCTION [mdq].[Similarity](@input1 [nvarchar](4000), @input2 [nvarchar](4000), @method [tinyint], @containmentBias [float], @minScoreHint [float])
RETURNS [float] WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT
AS EXTERNAL NAME [Microsoft.MasterDataServices.DataQuality].[Microsoft.MasterDataServices.DataQuality.SqlClr].[Similarity]
Pairs of words that are one edit apart produce different Levenshtein scores, which seem to be scaled by word length (the number of characters in the word):
SELECT a=mds.mdq.Similarity('a','',0,0,0),
ab=mds.mdq.Similarity('ab','a',0,0,0),
abc=mds.mdq.Similarity('abc','ab',0,0,0),
ac=mds.mdq.Similarity('ac','ab',0,0,0)
a    ab    abc    ac
0    0.5   0.67   0.5
Instead, I need it to return 1 in each case, because the two words in each pair differ by a single edit (insertion, deletion, or substitution).
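One possible reading of the four scores above, offered as an assumption rather than documented behavior: they are consistent with score = 1 - edits / max(length of input1, length of input2). If that holds, the edit count could be recovered as (1 - score) * max length, rounded to absorb the rounding in the displayed scores. A minimal sketch that checks this against the results shown (the derived table t and its column names w1, w2, score are made up for illustration):

```sql
-- Assumption: score = 1 - edits / max(LEN(w1), LEN(w2)).
-- The rows below are the four pairs and scores from the question,
-- hard-coded so the check runs without the MDS database.
SELECT w1,
       w2,
       score,
       edits = CAST(ROUND((1 - score) *
                   CASE WHEN LEN(w1) >= LEN(w2) THEN LEN(w1) ELSE LEN(w2) END,
                   0) AS int)
FROM (VALUES ('a',   '',   0.0),
             ('ab',  'a',  0.5),
             ('abc', 'ab', 0.67),
             ('ac',  'ab', 0.5)) AS t(w1, w2, score);
```

Under that assumption, edits comes out as 1 for each of the four pairs. Against the real function, score would be replaced by the call mds.mdq.Similarity(w1, w2, 0, 0, 0). Note that LEN ignores trailing spaces, so inputs with trailing blanks would need a different length calculation.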