I'm currently cleaning a dataset that shows names and gifts received by each person.
Each row goes like this:
Name | Gift |
---|---|
Agustin Dellagiovanna | Chocolate |
Agustín Delalgiovanna | Furniture |
Agustín Dellagiovanna | Art |
As you can see in this example, these three rows represent the same person, but two of them contain different typos. The same thing happens with many other names in the dataset.
I'd like to know whether there is a way to find these variations of the same name and replace them with the correct spelling.
For now, my only idea is to check the list of unique values in that column and pick out each variation by hand, but this is very time-consuming given that the dataset has 45,954 rows.
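To make the goal concrete, here is a rough sketch of the kind of automation I have in mind. It uses only Python's standard library (`difflib` and `unicodedata`). The helper names `normalize` and `build_name_map`, the 0.9 similarity cutoff, and the rule "the most frequent spelling is the correct one" are all my own assumptions, not something I know to be the right approach.

```python
import difflib
import unicodedata
from collections import Counter


def normalize(name):
    """Lowercase and strip accents so 'Agustín' and 'Agustin' compare as equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.lower().strip()


def build_name_map(names, cutoff=0.9):
    """Return {spelling: canonical spelling} for every distinct spelling in names.

    Assumption: a more frequent spelling is more likely to be the correct one,
    so frequent spellings are visited first and become the canonical form.
    """
    counts = Counter(names)
    canonical_by_key = {}  # normalized key -> chosen canonical spelling
    mapping = {}
    for spelling, _ in counts.most_common():
        key = normalize(spelling)
        # Look for an already accepted name that is similar enough to this one.
        match = difflib.get_close_matches(
            key, list(canonical_by_key), n=1, cutoff=cutoff
        )
        if match:
            mapping[spelling] = canonical_by_key[match[0]]
        else:
            canonical_by_key[key] = spelling
            mapping[spelling] = spelling
    return mapping


# The three spellings from the example rows above.
names = [
    "Agustin Dellagiovanna",
    "Agustín Delalgiovanna",
    "Agustín Dellagiovanna",
]
mapping = build_name_map(names)
print(mapping)
```

If the data were in a pandas DataFrame called `df` (an assumption about my setup), I imagine the result could be applied with something like `df["Name"] = df["Name"].map(mapping)`. I would still expect to review the proposed mapping by hand, since a cutoff that is too low could merge two different people with similar names.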
Any ideas?
Thanks in advance!