
I'm trying to harmonize client names across several files by mapping each of the name variations found in those files to a single common name.

[Image: client name harmonization table]

I tried fuzzy matching with grep with some success, but many client names were still not resolved correctly. I decided to use an intermediary table to find names in the offending files and replace them with a common standard name. The R code below works great on short lists, but on a list of 225K+ clients it bogged down badly. I know a for loop is not the most efficient way to do this, so I'm looking for other suggestions.

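`## Substitute common name in interactions file
n_rows <- nrow(interactions)  # avoid shadowing the nrow() function
for (i in seq_len(n_rows)) {  # seq_len() is safe when there are zero rows
  mywildcard <- interactions$client[i]
  # Position of this client name in the lookup table (NA if absent)
  match_location <- match(mywildcard, common_client$service_center)
  interactions$client[i] <- common_client$common[match_location]
}`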
    Please share a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) or [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) with an example input and your expected output. – Martin Gal Apr 06 '23 at 22:29

1 Answer


Hard to test without the data (try `dput` to share it), but here is my attempt:

match_location = match(interactions$client, common_client$service_center)
interactions$client = common_client$common[match_location]
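
One thing to watch for: `match` returns `NA` for any name that is not in the lookup table, so the two lines above overwrite unmatched clients with `NA`. If you would rather keep the original name in that case, something like the sketch below may help. The toy data is made up for illustration (only the column names come from the question), and `found` is just a helper name I introduced.

```r
# Hypothetical toy data; only the column names follow the question.
common_client <- data.frame(
  service_center = c("ACME Inc", "Acme Incorporated", "Globex Corp."),
  common         = c("Acme", "Acme", "Globex"),
  stringsAsFactors = FALSE
)
interactions <- data.frame(
  client = c("Acme Incorporated", "Globex Corp.", "Initech", "ACME Inc"),
  stringsAsFactors = FALSE
)

# One vectorized lookup over the whole column; NA marks names with no entry.
match_location <- match(interactions$client, common_client$service_center)

# Replace only the matched rows so unmatched names keep their original value.
found <- !is.na(match_location)
interactions$client[found] <- common_client$common[match_location[found]]

print(interactions$client)
```

Here "Initech" has no entry in the table, so it is left untouched instead of becoming `NA`. The lookup is still a single call to `match` over the whole column, so it should scale to a few hundred thousand rows far better than the row-by-row loop.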
Andrey Shabalin