
I have a data frame with a column of city names, some of which are misspelled or otherwise inconsistent. I also have a CSV file that lists each incorrect name next to its corrected city name.

Sample of my data:

> df$CITY
CITY
Bostn
Los angeles
Chicagoo
NYC

CSV file with the corrected names:

> head(city)
city_correct   city_incorr
Boston         Bostn
Los Angeles    Los angeles
Chicago        Chicagoo
New York City  NYC

How could I use my "city.csv" file to rename incorrect names in my df$CITY column?

Juli
  • `merge(df, city, by.x = 'CITY', by.y = 'city_incorr')` – Ronak Shah May 15 '20 at 00:04
  • My df$CITY column has over 20k observations and 600 city names; some are misspelled, but many are spelled correctly. merge() only adds a new column with the correct names at the end of df, so an extra step is still needed to copy those names into df$CITY. That gets you halfway there, but ideally I would REPLACE the misspelled names in df$CITY directly, based on the information in the city.csv file. – Juli May 15 '20 at 16:58
  • Also, for anyone using the merge() code: include "all = TRUE" (or "all.x = TRUE" to keep only the rows of df) so that observations without a city.csv match are not dropped. – Juli May 15 '20 at 16:58
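
The in-place replacement asked for in the comments above could be sketched with base R's `match()`, which avoids the extra column that `merge()` creates. This is only a sketch under some assumptions: `df$CITY` and both columns of `city` are character vectors rather than factors, each misspelling appears once in `city$city_incorr`, and the "Seattle" row is a made-up example of a name that is already correct and has no entry in the lookup table.

```r
# Sample data mirroring the question. "Seattle" is a hypothetical extra row
# standing in for a name that is already spelled correctly.
df <- data.frame(
  CITY = c("Bostn", "Los angeles", "Chicagoo", "NYC", "Seattle"),
  stringsAsFactors = FALSE
)

# In practice this table would be read from the CSV, for example with
# read.csv("city.csv", stringsAsFactors = FALSE).
city <- data.frame(
  city_correct = c("Boston", "Los Angeles", "Chicago", "New York City"),
  city_incorr  = c("Bostn", "Los angeles", "Chicagoo", "NYC"),
  stringsAsFactors = FALSE
)

# Row of the lookup table that each df$CITY value matches (NA if none).
idx <- match(df$CITY, city$city_incorr)

# Overwrite only the rows that had a match; all other names stay as they are.
found <- !is.na(idx)
df$CITY[found] <- city$city_correct[idx[found]]

print(df$CITY)
```

Because rows are updated by position, the order and number of rows in `df` do not change, which is not guaranteed after `merge()`.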
