Fuzzywuzzy is not giving proper result

Question

I am searching for user='Jefferey Roberts', and fuzzywuzzy gives this result: result=[('Jeremiah James Roberts Jr', 86), ('Jeffrey Scott Roberts', 81), ('Jeremiah J Roberts', 71)]

Code -

from fuzzywuzzy import process
user='Jefferey Roberts'
result=['Jeremiah James Roberts Jr', 'Jeffrey Scott Roberts', 'Jeremiah J Roberts']
output=process.extract(user,result)
print(output)

I expected the second element of the list, 'Jeffrey Scott Roberts', to get the highest score.

Similarly, if I use get_close_matches from the difflib module on the list ['Gary Wayne Waller', 'Zayn Waller', 'Debra Kay Waller'] and search for 'Gary Waller', it returns 'Zayn Waller' at the first index instead of 'Gary Wayne Waller'.

Code-

from difflib import get_close_matches
user='Gary Waller'
result= ['Gary Wayne Waller', 'Zayn Waller', 'Debra Kay Waller']
output=get_close_matches(user,result)
print(output)

Please help with a solution, or suggest a more accurate module than fuzzywuzzy and get_close_matches.

Can you please give supporting code so that we can reproduce the problem? – Harsh Gupta, Oct 01 '22 at 11:16
What result are you expecting? https://stackoverflow.com/help/minimal-reproducible-example – D.L, Oct 01 '22 at 11:27

Sachin Kohli · Answer 1 · 2022-10-01T17:21:37.560

0

You can use SequenceMatcher from difflib and compare the query against each candidate directly:

from difflib import SequenceMatcher

b = "Jefferey Roberts"
a_lst = ['Jeremiah James Roberts Jr', 'Jeffrey Scott Roberts', 'Jeremiah J Roberts']

for a in a_lst:
    print(a,SequenceMatcher(None, a, b).ratio())

Output:

Jeremiah James Roberts Jr 0.5853658536585366
Jeffrey Scott Roberts 0.8108108108108109
Jeremiah J Roberts 0.7058823529411765
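To turn these per-string ratios into a ranking, one option is to sort the candidates by score. This is only a minimal sketch using difflib; the helper name rank_by_ratio is made up for illustration, and the argument order (candidate first, query second) mirrors the loop above.

```python
from difflib import SequenceMatcher


def rank_by_ratio(query, candidates):
    """Return (candidate, ratio) pairs sorted from best to worst match."""
    # Candidate goes first and query second, as in the loop above.
    scored = [(c, SequenceMatcher(None, c, query).ratio()) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


names = ['Jeremiah James Roberts Jr', 'Jeffrey Scott Roberts', 'Jeremiah J Roberts']
ranked = rank_by_ratio('Jefferey Roberts', names)
for name, score in ranked:
    print(name, round(score, 3))
```

With the ratios shown above, 'Jeffrey Scott Roberts' ends up first in the ranked list.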

Edit:

Check out this post on similarity matching between strings to see the kinds of algorithms and packages available: Find the similarity metric between two strings

edited Oct 01 '22 at 17:21

answered Oct 01 '22 at 11:19

Sachin Kohli


But if you take this list ['Gary Wayne Waller', 'Zayn Waller', 'Debra Kay Waller'] and search for 'Gary Waller', it returns a higher value for 'Zayn Waller' than for 'Gary Wayne Waller'. – faizan khan Oct 01 '22 at 15:45
I've edited my answer with a link. You can probably find a module there that works for your problem statement. Thanks – Sachin Kohli Oct 01 '22 at 17:22
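For the 'Gary Waller' case raised in the comment above, whole-string ratios penalize the extra middle name. One possible workaround, sketched here as an assumption rather than as what fuzzywuzzy or get_close_matches do internally, is to compare word by word: each word of the query is scored against its best-matching word in the candidate, and the scores are averaged. The helper names token_score and best_match are hypothetical.

```python
from difflib import SequenceMatcher


def token_score(query, candidate):
    """Average, over the query's words, of the best ratio against any candidate word."""
    query_tokens = query.lower().split()
    candidate_tokens = candidate.lower().split()
    if not query_tokens or not candidate_tokens:
        return 0.0
    # For each query word, keep only its closest word in the candidate.
    best = [
        max(SequenceMatcher(None, q, c).ratio() for c in candidate_tokens)
        for q in query_tokens
    ]
    return sum(best) / len(best)


def best_match(query, candidates):
    """Return the candidate with the highest token_score for the query."""
    return max(candidates, key=lambda c: token_score(query, c))


wallers = ['Gary Wayne Waller', 'Zayn Waller', 'Debra Kay Waller']
print(best_match('Gary Waller', wallers))

roberts = ['Jeremiah James Roberts Jr', 'Jeffrey Scott Roberts', 'Jeremiah J Roberts']
print(best_match('Jefferey Roberts', roberts))
```

Extra words in the candidate, such as a middle name, do not lower the score under this scheme. The trade-off is that word order is ignored, so 'Waller Gary' would score the same as 'Gary Waller'.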
