0

Let's say I have the following table:

Col_1 Col_2 Col_3
AltonJamesWebs AltonJamsWebs 96.55%
BuddarakhGrillIzme BuddarakhGriIzmezh 89.47%

How can I calculate the percentage of similarity between two string columns Col_1 and Col_2? I would like the percentage of similarity to be displayed in Col_3.

I tried using the LEVENSHTEIN distance, but it says:

'LEVENSHTEIN' is not a recognized built-in function name.

Thom A
  • 88,727
  • 11
  • 45
  • 75
Isaac A
  • 543
  • 1
  • 6
  • 18
  • Does [this](https://stackoverflow.com/questions/560709/levenshtein-distance-in-t-sql) answer your question? – Serg Jan 10 '23 at 15:58
  • 1
    *"I am using Microsoft SQL Server Management Stuido."* This isn't particularly helpful. SQL Server Management Studio (SSMS) is an IDE like application that can be used with (and not limited to) SQL Server, Azure SQL Edge and Azure SQL Database. The latest version of SSMS (19) also supports *many* versions of the same product; for SQL Server is supports SQL Server 2008-2022. If you want to tell us what (R)DBMS you are using, and the version, then please use the tags. – Thom A Jan 10 '23 at 16:02
  • 1
    As for the error, it is telling you the problem. There is no built-in function to calculate the levenshtein distance, let alone one call `LEVENSHTEIN`. You'll need to create a user-defined function to calculate it. – Thom A Jan 10 '23 at 16:03
  • @Larnu SQL Server. – Isaac A Jan 10 '23 at 16:03
  • Yes, you tagged that, @IsaacA , so we know that. My point was about that telling us you are using SSMS doesn't actually tell us much/anything other than you are (likely) using a Microsoft based (R)DBMS. It's a bit like telling us "I am using Visual Studio" when asked what language you are using. VS/SSMS are the IDE (like) applications, not the language/data engine. – Thom A Jan 10 '23 at 16:05
  • @Serg Hello Serg, Please post your answer in the answer box so that I can choose it as the solution. – Isaac A Jan 11 '23 at 15:39

1 Answers1

1

U can try that

CREATE FUNCTION [dbo].[GetPercentageOfTwoStringMatching]
 (
     @string1 NVARCHAR(100)
     ,@string2 NVARCHAR(100)
 )
 RETURNS INT
 AS
 BEGIN
    
     DECLARE @levenShteinNumber INT
    
     DECLARE @string1Length INT = LEN(@string1)
     , @string2Length INT = LEN(@string2)
     DECLARE @maxLengthNumber INT = CASE WHEN @string1Length > @string2Length THEN @string1Length ELSE @string2Length END
    
     SELECT @levenShteinNumber = [dbo].[LEVENSHTEIN] (   @string1  ,@string2)
    
     DECLARE @percentageOfBadCharacters INT = @levenShteinNumber * 100 / @maxLengthNumber
    
     DECLARE @percentageOfGoodCharacters INT = 100 - @percentageOfBadCharacters
    
     RETURN @percentageOfGoodCharacters
    
 END
    
    
 CREATE FUNCTION [dbo].[LEVENSHTEIN](@left  VARCHAR(100),
                                     @right VARCHAR(100))
 returns INT
 AS
   BEGIN
       DECLARE @difference    INT,
               @lenRight      INT,
               @lenLeft       INT,
               @leftIndex     INT,
               @rightIndex    INT,
               @left_char     CHAR(1),
               @right_char    CHAR(1),
               @compareLength INT
    
       SET @lenLeft = LEN(@left)
       SET @lenRight = LEN(@right)
       SET @difference = 0
    
       IF @lenLeft = 0
         BEGIN
             SET @difference = @lenRight
    
             GOTO done
         END
    
       IF @lenRight = 0
         BEGIN
             SET @difference = @lenLeft
    
             GOTO done
         END
    
       GOTO comparison
    
       COMPARISON:
    
       IF ( @lenLeft >= @lenRight )
         SET @compareLength = @lenLeft
       ELSE
         SET @compareLength = @lenRight
    
       SET @rightIndex = 1
       SET @leftIndex = 1
    
       WHILE @leftIndex <= @compareLength
         BEGIN
             SET @left_char = substring(@left, @leftIndex, 1)
             SET @right_char = substring(@right, @rightIndex, 1)
    
             IF @left_char <> @right_char
               BEGIN -- Would an insertion make them re-align?
                   IF( @left_char = substring(@right, @rightIndex + 1, 1) )
                     SET @rightIndex = @rightIndex + 1
                   -- Would an deletion make them re-align?
                   ELSE IF( substring(@left, @leftIndex + 1, 1) = @right_char )
                     SET @leftIndex = @leftIndex + 1
    
                   SET @difference = @difference + 1
               END
    
             SET @leftIndex = @leftIndex + 1
             SET @rightIndex = @rightIndex + 1
         END
    
       GOTO done
    
       DONE:
    
       RETURN @difference
   END 
    
   SELECT  *,[dbo].[GetPercentageOfTwoStringMatching](MaskText  ,'TR00123907FG') as[percent] from #compareTextPercentage
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jan 11 '23 at 00:56