
I have a large data frame that contains Address Information (CUST_ADDRESS_1 and CUST_ADDRESS_2).

CUST_ADDRESS_1 should only contain street information such as 123 Anywhere Drive while CUST_ADDRESS_2 should only contain Suite information such as Suite 23.

I want to find all of the instances where Suite information is located in CUST_ADDRESS_1 and place it in CUST_ADDRESS_2.

I'm okay if Suite information replaces current data in CUST_ADDRESS_2 but I only want the data replaced if it meets that condition.

For example:

BEFORE

CUST_ADDRESS_1                     CUST_ADDRESS_2
986 Eastern Drive                  Suite 180
763 Sunset Drive, Suite 2          Attn: Mark Matthews
543 Roanoke Lane
4201 Practice Road, Suite 18

AFTER

CUST_ADDRESS_1                     CUST_ADDRESS_2
986 Eastern Drive                  Suite 180
763 Sunset Drive                   Suite 2
543 Roanoke Lane
4201 Practice Road                 Suite 18

I tried the following, but if Suite info is not found in CUST_ADDRESS_1, it deletes the data in CUST_ADDRESS_2.

RosterFinal$CUST_ADDRESS_2 <- if_else(
  grepl("SUITE", RosterFinal$CUST_ADDRESS_1),
  substr(RosterFinal$CUST_ADDRESS_1, regexpr("SUITE", RosterFinal$CUST_ADDRESS_1) - 1, nchar(RosterFinal$CUST_ADDRESS_1)),
  if_else(
    grepl(" STE", RosterFinal$CUST_ADDRESS_1),
    substr(RosterFinal$CUST_ADDRESS_1, regexpr(" STE", RosterFinal$CUST_ADDRESS_1) - 1, nchar(RosterFinal$CUST_ADDRESS_1)),
    if_else(
      grepl(" #", RosterFinal$CUST_ADDRESS_1),
      substr(RosterFinal$CUST_ADDRESS_1, regexpr(" #", RosterFinal$CUST_ADDRESS_1) - 1, nchar(RosterFinal$CUST_ADDRESS_1)),
      ""
    )
  )
)
Will
  • Please do not post data as images. It makes it much more difficult to import into R. It is better to share data in a [reproducible format](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – MrFlick Jun 25 '18 at 15:04

2 Answers

0
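library(dplyr)  # provides if_else()

# Build the new values in CUST_ADDRESS_3. Rows without a suite marker fall back to
# their existing CUST_ADDRESS_2 value instead of "", so nothing is deleted.
RosterFinal$CUST_ADDRESS_3 <- if_else(
  grepl("SUITE", RosterFinal$CUST_ADDRESS_1),
  substr(RosterFinal$CUST_ADDRESS_1, regexpr("SUITE", RosterFinal$CUST_ADDRESS_1) - 1, nchar(RosterFinal$CUST_ADDRESS_1)),
  if_else(
    grepl(" STE", RosterFinal$CUST_ADDRESS_1),
    substr(RosterFinal$CUST_ADDRESS_1, regexpr(" STE", RosterFinal$CUST_ADDRESS_1) - 1, nchar(RosterFinal$CUST_ADDRESS_1)),
    if_else(
      grepl(" #", RosterFinal$CUST_ADDRESS_1),
      substr(RosterFinal$CUST_ADDRESS_1, regexpr(" #", RosterFinal$CUST_ADDRESS_1) - 1, nchar(RosterFinal$CUST_ADDRESS_1)),
      as.character(RosterFinal$CUST_ADDRESS_2)
    )
  )
)

# Drop the old column. Negating a column name with "-" is an error, so remove it by assignment.
RosterFinal$CUST_ADDRESS_2 <- NULL

# Rename the new column to take its place.
colnames(RosterFinal)[colnames(RosterFinal) == "CUST_ADDRESS_3"] <- "CUST_ADDRESS_2"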
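
The key point is the fallback branch of the conditional. Below is a minimal base R sketch of that idea, assuming character columns and using hypothetical upper-case sample rows (the `roster` data frame is made up for illustration). Rows that match are overwritten, and all other rows keep whatever CUST_ADDRESS_2 already holds.

```r
# Hypothetical sample data: character columns, "" for blank cells.
roster <- data.frame(
  CUST_ADDRESS_1 = c("986 EASTERN DRIVE", "763 SUNSET DRIVE, SUITE 2", "543 ROANOKE LANE"),
  CUST_ADDRESS_2 = c("SUITE 180", "ATTN: MARK MATTHEWS", ""),
  stringsAsFactors = FALSE
)

# TRUE for rows whose first address line contains "SUITE".
has_suite <- grepl("SUITE", roster$CUST_ADDRESS_1, fixed = TRUE)

# Start position of "SUITE" in each row (-1 where there is no match).
start <- as.integer(regexpr("SUITE", roster$CUST_ADDRESS_1, fixed = TRUE))

# Matching rows get the suite text; non-matching rows keep their current value.
roster$CUST_ADDRESS_2 <- ifelse(
  has_suite,
  substr(roster$CUST_ADDRESS_1, start, nchar(roster$CUST_ADDRESS_1)),
  roster$CUST_ADDRESS_2
)

print(roster)
```

This sketch does not yet remove the suite text from CUST_ADDRESS_1; it only shows how to avoid blanking the second column.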
0

This sounds like a standard use case for R's which(). The code below is certainly not optimal, but it should give you an idea of how to deal with this kind of problem.

Try the following:

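# Assumes both address columns are character vectors, not factors.
idx <- which(grepl(", Suite ", RosterFinal$CUST_ADDRESS_1, fixed = TRUE))  # rows that contain suite info
RosterFinal$CUST_ADDRESS_2[idx] <- sub("^.*, (Suite .*)$", "\\1", RosterFinal$CUST_ADDRESS_1[idx])  # copy "Suite ..." over
RosterFinal$CUST_ADDRESS_1[idx] <- sub(", Suite .*$", "", RosterFinal$CUST_ADDRESS_1[idx])  # keep only the street part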
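
The same row-selection idea can be extended to the other markers mentioned in the question ("STE" and "#"). The sketch below is one possible way to do it, not a definitive solution: `move_suite` is a hypothetical helper, the regular expression is an assumption about how suite information is written, and the sample rows are made up beyond the four shown in the question. It assumes character columns.

```r
# move_suite() is a hypothetical helper, not part of any package.
move_suite <- function(df) {
  # "suite" or "ste" as a whole word (any case), or a "#", through the end of the line.
  pattern <- "\\b(suite|ste)\\b.*$|#.*$"
  m <- regexpr(pattern, df$CUST_ADDRESS_1, ignore.case = TRUE, perl = TRUE)
  start <- as.integer(m)   # -1 where nothing matched
  hit <- start > 0

  # regmatches() returns the matched text for the matching rows only.
  df$CUST_ADDRESS_2[hit] <- trimws(regmatches(df$CUST_ADDRESS_1, m))

  # Keep everything before the match, then drop trailing commas and spaces.
  street <- substr(df$CUST_ADDRESS_1[hit], 1, start[hit] - 1)
  df$CUST_ADDRESS_1[hit] <- sub("[,[:space:]]+$", "", street)
  df
}

# Sample rows: the first four mirror the question, the last two are assumptions.
roster <- data.frame(
  CUST_ADDRESS_1 = c("986 Eastern Drive", "763 Sunset Drive, Suite 2",
                     "543 Roanoke Lane", "4201 Practice Road, Suite 18",
                     "77 Oak Ave STE 9", "12 Main St #4"),
  CUST_ADDRESS_2 = c("Suite 180", "Attn: Mark Matthews", "", "", "", ""),
  stringsAsFactors = FALSE
)

result <- move_suite(roster)
print(result)
```

The word boundary in the pattern keeps street names such as "Eastern" from matching on the letters "ste". Real address data is messier than this, so the pattern would likely need tuning against the actual records.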
Rushabh Mehta
  • No problem! Make sure to upvote and select as best answer the answer you think best fits your problem, and is most useful to any future visitors to the site! – Rushabh Mehta Jun 27 '18 at 16:31