I have a sizeable data set that includes several hundred company names and looks something like this:
Name:
Earth Ltd.
Rocket International LLC
Space Corp LLC
Space Corporation LLc
Space International Corporation Ltd
Satellite Global
Some entries are just different spellings (sometimes misspellings or renamings) of what is, for my purposes, the same company. I am trying to collapse these variants into one consistent version, e.g. Space Corp LLC, Space Corporation LLc, and Space International Corporation Ltd into Space Corp. LLC.
Is there a script or package that lets me extract syntactically or otherwise similar entries, so I can see which entries I need to collapse?
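To illustrate the kind of output I am after, here is a minimal sketch in Python using only the standard library's difflib. The `names` list, the `similarity` and `similar_pairs` helpers, and the 0.75 threshold are assumptions made up for this example, not part of my actual data or any particular package.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Sample names taken from the list above.
names = [
    "Earth Ltd.",
    "Rocket International LLC",
    "Space Corp LLC",
    "Space Corporation LLc",
    "Space International Corporation Ltd",
    "Satellite Global",
]


def similarity(a, b):
    """Case-insensitive character similarity between two strings, from 0 to 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similar_pairs(names, threshold=0.75):
    """Return (name_a, name_b, score) for every pair scoring at or above threshold.

    The threshold is an arbitrary starting point and needs tuning on real data.
    """
    pairs = []
    for a, b in combinations(names, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    # Most similar pairs first, so the likeliest duplicates are reviewed first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


for a, b, score in similar_pairs(names):
    print(f"{score:.2f}  {a!r} <-> {b!r}")
```

A plain character ratio like this will likely miss variants that insert whole extra words (such as Space International Corporation Ltd), so something token-based or a lower threshold is probably needed for those.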
Thanks a lot!