0

I want to join two tables with best match strings. Table A contains some values which do not match 100% with Table B.

But I want to join Table A with Table B having a maximum matching record.

Following is a sample of Table A, Table B, and required output with some logic.

enter image description here

enter image description here

enter image description here

Following is the sample tables:

drop table if exists #C
CREATE TABLE #C(ID int, ReferenceID varchar (100))
insert into #C values

(1,'ABC123'),
(2,'456XYZ'),
(3,'5PQR3'),
(4,'789XYZ'),
(5,'789A')


drop table if exists #D
CREATE TABLE #D(ReferenceID varchar (100), FirstName varchar (100))
INSERT INTO #D VALUES
('XYZ','Olivia'),
('ABD123','Emma'),
('45XYZ','William'), 
('ABC456','Jackson')
Teknas
  • 541
  • 5
  • 17
  • [this](https://stackoverflow.com/questions/921978/fuzzy-matching-using-t-sql) might help you. – sacse Apr 26 '20 at 10:11

1 Answers1

1

You could use the GetPercentageOfTwoStringMatching of this post here on StackOverFlow. With this you could write the following:

select A.ID,A.ReferenceID, B.FirstName, [dbo].[GetPercentageOfTwoStringMatching](A.ReferenceID  ,B.ReferenceID) as strMatch from Table1 as A
left outer join Table2 as B on [dbo].[GetPercentageOfTwoStringMatching](A.ReferenceID  ,B.ReferenceID) > 30

(I used Table1 and Table2 instead of #C and #D in your example)

Which gives the following output:

enter image description here

You still have to filter out the double first names in this example.

wserr
  • 436
  • 1
  • 3
  • 12