
I am working on a project where I have to cross-reference a list of names between two websites. After grabbing a name from the first website, I use it to locate that person's profile on the second website. However, the two sites sometimes format or spell a name slightly differently, which causes lookup errors that I have to fix manually.

For example, the first website may spell someone's name as "Erik Silva", whereas the second website spells it as "Erick Silva". I then have to change the name manually, as below:

    if name == "Erik Silva":
        name = "Erick Silva"

Note that this also happens with last names, but usually not with both names at once. Is there a Python package that can automatically try multiple spellings, or values close to the input name? This would save me the time of manually adjusting each name.

So far I have only been able to adjust the names manually.

user54565
  • If you have defined mappings use a dictionary, if you want to guess the best match automatically use something like [`difflib`](https://stackoverflow.com/a/10018734/16343464). Or a combination of both... – mozway Aug 10 '23 at 03:40
  • I'd investigate whether Levenshtein distance, or something similar that does fuzzy matching between two names, would work here. Try https://github.com/seatgeek/thefuzz – rasjani Aug 10 '23 at 06:26
  • Would fuzzy matching take less time than manually fixing the names? I have over 500 entries that need to be fixed. I am not 100% sure how fuzzy matching works, but from my understanding it generates a list of probable similar names. Would I have to iterate through that list for each person? – user54565 Aug 10 '23 at 23:36
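
To make the suggestions in the comments concrete, here is a minimal sketch that combines a dictionary of known corrections with the standard library's `difflib.get_close_matches`. The name lists, the `resolve_name` helper, and the `0.8` cutoff are all assumptions for illustration; the real candidate list would come from whatever names the second website exposes, and the cutoff would likely need tuning on real data.

```python
import difflib

# Hypothetical data: names as they are spelled on the second website.
site_b_names = ["Erick Silva", "Anderson Silva", "Jon Jones", "Demian Maia"]

# Hypothetical hand-maintained corrections for cases fuzzy matching cannot guess.
manual_overrides = {"Jonny Bones": "Jon Jones"}


def resolve_name(name, candidates, overrides=None, cutoff=0.8):
    """Return the best matching candidate for `name`, or None if nothing is close.

    Order of preference: explicit override, exact match, closest fuzzy match.
    """
    if overrides and name in overrides:
        return overrides[name]
    if name in candidates:
        return name
    # n=1 asks for only the single best match scoring at least `cutoff`,
    # so there is no list of alternatives to iterate over.
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for raw in ["Erik Silva", "Jonny Bones", "Someone Unknown"]:
        print(raw, "->", resolve_name(raw, site_b_names, manual_overrides))
```

With this approach there should be no need to walk through a list of probable names for each person: the function keeps only the top match above the cutoff, and any name that resolves to `None` can be collected and reviewed by hand. A higher cutoff reduces wrong matches at the cost of more manual review.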

0 Answers