Name Matching in Python

Question

We have a third-party 'tool' which finds similar names and assigns a similarity score between two names.

I am supposed to mimic the tool's behavior as closely as possible. After searching the internet, I tried a string-distance approach using fuzzywuzzy:

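from fuzzywuzzy import fuzz, process

# name: the query string; choices: the candidate names to compare against
matches = process.extractBests(
    name,
    choices,
    score_cutoff=50,  # discard candidates scoring below 50
    scorer=fuzz.token_sort_ratio,  # compare after sorting the tokens of each name
    limit=1,  # keep only the best match
)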
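For context, a token-sort scorer roughly works by normalizing both strings, sorting their tokens alphabetically, and then computing a similarity ratio on the results. The sketch below approximates that idea using only the standard library (difflib). The function name token_sort_ratio_sketch is hypothetical, and its scores are not guaranteed to match fuzzywuzzy's exactly.

```python
import re
from difflib import SequenceMatcher


def token_sort_ratio_sketch(a: str, b: str) -> int:
    """Approximate a token-sort similarity score in the range 0-100."""

    def normalize(s: str) -> str:
        # Lowercase, turn punctuation into spaces, then sort the tokens
        # so that word order no longer affects the comparison.
        tokens = re.sub(r"[\W_]+", " ", s.lower()).split()
        return " ".join(sorted(tokens))

    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return round(100 * ratio)


print(token_sort_ratio_sketch("John Smith", "Smith, John"))
```

With this scorer, "John Smith" and "Smith, John" normalize to the same string, so word order alone never lowers the score.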

It gave results close to the tool's results. However, there are a few outliers, as highlighted below.

After further searching, I have come to understand that further refinement will need some sort of machine learning. I am a complete newbie to machine learning, so I am seeking advice on what I should try next to refine the code.

Thanks!

https://stackoverflow.com/questions/2923420/what-is-a-simple-fuzzy-string-matching-algorithm-in-python — Chris_Rands, May 27 '19 at 13:34
Can I ask what 3rd party tool you were using for the first column? — Stpete111, Jul 01 '20 at 20:19
@Stpete111 The tool is bridger - https://risk.lexisnexis.com/products/bridger-insight-xg — Soumya, Jul 11 '20 at 02:33
Thanks. Ah ok, so an actual full search solution. I thought you meant a 3rd-party name-match algorithm to which you have access to implement into your own code. — Stpete111, Jul 11 '20 at 20:13

score 4 · Answer 1 · answered Mar 11 '21 at 22:10

Take a look at the HMNI package. It is tailor-made for name matching.

Yash M

score 0 · Answer 2 · answered May 27 '19 at 13:34

Take a look at the Jaccard and Levenshtein algorithms for fuzzy string matching. Both are relatively simple and can be implemented in about 40 or 50 lines of code.
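As a rough illustration of how small these two algorithms are, here is a minimal sketch in plain Python. The function names are hypothetical, and treating Jaccard as a comparison of lowercase word sets is an assumption; character n-grams are another common choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (free on a match)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Scale the edit distance to a 0.0-1.0 similarity."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the sets of lowercase word tokens."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

Levenshtein is sensitive to spelling differences within a word, while word-level Jaccard ignores word order but treats any misspelled word as a complete mismatch, so the two are often combined for names.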

Michael Bianconi

fuzzywuzzy library mentioned in the question uses Levenshtein similarity – pedram bashiri Mar 12 '20 at 20:24
