I have a table in which one column contains some misspelled strings. For example:
table$Status
returns these values:
"alive" "sic" "alive" "sick" "alive" "si" "alive" "ali" "alv"
"dead" "alive" "alive" "alive" "al" "dead" "dead" "de" "dead"
"dead" "dea" "dead" "al" "dead" "de" "al" "de" "sick"
"dead" "alive"
I want every value to be one of alive, sick or dead, as in the following example:
"alive" "sick" "alive" "sick" "alive" "sick" "alive" "alive" "alive"
"dead" "alive" "alive" "alive" "alive" "dead" "dead" "dead" "dead"
"dead" "dead" "dead" "alive" "dead" "dead" "alive" "dead" "sick"
"dead" "alive"
I know the RecordLinkage package has a function
that returns a similarity score between two strings, based on the Levenshtein distance, for example:
levenshteinSim("al", "alive")
So I would compare every value with every other value and keep the best similarities. I also know that table(table$Status)
gives me the count of each value, and I assume the most frequent values are the correct ones.
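A quick check of the frequency idea on the sample above, as a minimal sketch. The vector name `status` is only a stand-in for table$Status. On this particular sample the counts are not a fully reliable guide: "al" and "de" each occur three times, while the correct value "sick" occurs only twice, so taking the top three values by count would miss "sick".

```r
# Sample values copied from the question; `status` stands in for table$Status.
status <- c("alive", "sic", "alive", "sick", "alive", "si", "alive", "ali", "alv",
            "dead", "alive", "alive", "alive", "al", "dead", "dead", "de", "dead",
            "dead", "dea", "dead", "al", "dead", "de", "al", "de", "sick",
            "dead", "alive")

# Count each distinct value and sort from most to least frequent.
counts <- sort(table(status), decreasing = TRUE)
print(counts)
```

Because of this, it may be safer to write the list of valid values by hand than to derive it from the counts.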
My question is: how can I compare them all with each other and replace the values in my table? An easy way to do this would be really helpful.
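One possible approach, as a hedged sketch rather than a definitive answer: compare each value against a short list of valid values instead of against every other value. The sketch assumes the valid values are known in advance and written out by hand (`valid` below), and it uses base R's `adist` instead of RecordLinkage so that it runs without extra packages. The names `status`, `valid`, `sim` and `cleaned` are illustrative. The edit distance is divided by the length of the longer string, because on raw distance "al" is equally far (3 edits) from "alive" and from "dead".

```r
# Sample values copied from the question; `status` stands in for table$Status.
status <- c("alive", "sic", "alive", "sick", "alive", "si", "alive", "ali", "alv",
            "dead", "alive", "alive", "alive", "al", "dead", "dead", "de", "dead",
            "dead", "dea", "dead", "al", "dead", "de", "al", "de", "sick",
            "dead", "alive")

# Assumption: the valid values are known and listed by hand.
valid <- c("alive", "sick", "dead")

# Levenshtein distance between every observed value (rows)
# and every valid value (columns).
d <- adist(status, valid)

# Length of the longer string in each pair, used to normalise the distance.
longest <- outer(nchar(status), nchar(valid), pmax)

# Similarity in [0, 1]: 1 means identical strings.
sim <- 1 - d / longest

# For each observed value, pick the valid value with the highest similarity.
cleaned <- valid[max.col(sim, ties.method = "first")]
print(cleaned)
```

With the real data, the result could be written back with table$Status <- cleaned. For values that are far from every valid entry, it may be worth inspecting the best similarity per row, for example apply(sim, 1, max), before overwriting anything.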