
I have two files of customer records that I need to merge. The addresses don't always match, either because they are different addresses or because part of the address is missing. I have tried a fuzzy join without success. I basically need to compare 4 columns and assemble a complete address. Example below: for the first record I need to pull out 320th 15th NW, and for the 6th record down I need 614 W Superior St. The address parts are spread across these 4 columns. Any help you could provide would be excellent.

data

  • You could try creating some string checks, e.g. test which column contains 'street', 'ST', 'ave', or a number followed by st/nd/rd/th, or something like that, and then pull the info from that column. It should probably work in 95% of the cases. – deschen Nov 21 '21 at 17:57
  • Also, please share a minimal reproducible example: https://stackoverflow.com/help/minimal-reproducible-example – deschen Nov 21 '21 at 18:21
  • Complementing @deschen's comment: in addition to scanning for e.g. "street", "st", "ave", you might require that the string includes a number. See [this question](https://stackoverflow.com/questions/33393112/check-if-a-string-contains-at-least-one-numeric-character-in-r) for examples of how to do that. – Jessica Burnett Nov 24 '21 at 22:35
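
The string checks suggested in the comments could be sketched roughly as follows. This is a minimal Python illustration, not a tested solution for the real data: the token list, the helper names `looks_like_street` and `pick_address`, and the sample rows are all assumptions (only the two target addresses come from the question). A value counts as a street address when it contains at least one digit and either a street-type word or an ordinal such as 15th.

```python
import re

# Hypothetical list of street-type tokens; extend it to fit the real data.
STREET_TYPE = re.compile(
    r"\b(street|st|avenue|ave|road|rd|drive|dr|blvd|lane|ln)\b"
    r"|\b\d+(st|nd|rd|th)\b",  # ordinals such as 1st, 2nd, 3rd, 15th
    re.IGNORECASE,
)
HAS_DIGIT = re.compile(r"\d")


def looks_like_street(value):
    """True if value has a digit and a street-type word or an ordinal."""
    if not value:  # handles None and empty strings
        return False
    return bool(HAS_DIGIT.search(value) and STREET_TYPE.search(value))


def pick_address(row):
    """Return the first value in row that looks like a street address, else None."""
    for value in row:
        if looks_like_street(value):
            return value.strip()
    return None


# Made-up rows standing in for the 4 address columns.
rows = [
    ("", "320th 15th NW", None, "Suite 2"),
    ("ACME Corp", "", "614 W Superior St", ""),
    ("PO Box", "", "", ""),
]
addresses = [pick_address(r) for r in rows]
```

Rows where no column passes the check come back as `None`, which makes it easy to list the leftover cases and handle them by hand or with extra rules.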

0 Answers