I have a single dataframe with fewer than 5000 rows (loaded from a CSV file). It has many columns, one of which is the company name. However, there are many duplicates with slightly different names. For example, one company can be called: HH 785 EN
And its duplicate could be called: HH 785EN or HH784 EN
Every duplicate differs from the original company name by about 1 or 2 characters.
I'm looking for an algorithm that could detect these duplicates. Most of the fuzzy matching problems I have seen involve 2 datasets, which isn't my case. I have seen many algorithms that take one word and a list as input, but I want to check my whole column of company names against itself.
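To make the goal concrete, here is a rough sketch of the kind of self-comparison I have in mind, using only the standard library (difflib.SequenceMatcher). The function name find_near_duplicates, the 0.8 threshold, and the sample names other than the three above are my own assumptions for illustration, and I don't know whether this brute-force approach is the right one:

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_near_duplicates(names, threshold=0.8):
    """Return (name_a, name_b, score) for each pair scoring at or above threshold.

    The threshold of 0.8 is an arbitrary guess and would need tuning.
    """
    # Drop exact duplicates first so each distinct name is compared only once.
    unique = sorted(set(names))
    pairs = []
    # Compare every distinct name with every other distinct name (no second dataset).
    for a, b in combinations(unique, 2):
        score = SequenceMatcher(None, a, b).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 3)))
    return pairs


# With pandas, the list could come from something like df["company"].tolist()
# ("company" is a hypothetical column name).
names = ["HH 785 EN", "HH 785EN", "HH784 EN", "ACME Corp"]
for a, b, score in find_near_duplicates(names):
    print(a, "<->", b, score)
```

This compares every pair, so about 12.5 million comparisons for 5000 distinct names, which is why I'm asking whether a better-suited algorithm exists.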
Thanks for your help.