Let's say I have a set of 2 words:
Alexander and Alecsander OR Alexander and Alegzander
Alexander and Aleaxnder, or any other combination. In general we are talking about human error in typing of a word or a set of words.
What I want to achieve is to get the percentage of matching of the characters of the 2 strings.
Here is what I have so far:
DECLARE @table1 TABLE
(
nr INT
, ch CHAR
)
DECLARE @table2 TABLE
(
nr INT
, ch CHAR
)
INSERT INTO @table1
SELECT nr,ch FROM [dbo].[SplitStringIntoCharacters] ('WORD w') --> return a table of characters(spaces included)
INSERT INTO @table2
SELECT nr,ch FROM [dbo].[SplitStringIntoCharacters] ('WORD 5')
DECLARE @resultsTable TABLE
(
ch1 CHAR
, ch2 CHAR
)
INSERT INTO @resultsTable
SELECT DISTINCt t1.ch ch1, t2.ch ch2 FROM @table1 t1
FULL JOIN @table2 t2 ON t1.ch = t2.ch --> returns both matches and missmatches
SELECT * FROM @resultsTable
DECLARE @nrOfMathches INT, @nrOfMismatches INT, @nrOfRowsInResultsTable INT
SELECT @nrOfMathches = COUNT(1) FROM @resultsTable WHERE ch1 IS NOT NULL AND ch2 IS NOT NULL
SELECT @nrOfMismatches = COUNT(1) FROM @resultsTable WHERE ch1 IS NULL OR ch2 IS NULL
SELECT @nrOfRowsInResultsTable = COUNT(1) FROM @resultsTable
SELECT @nrOfMathches * 100 / @nrOfRowsInResultsTable
The SELECT * FROM @resultsTable
will return the following:
ch1 ch2
NULL 5
[blank] [blank]
D D
O O
R R
W W