This should be an easy task (and it will probably be marked as a duplicate), but I can't find anywhere how to do it directly within a data frame, without extracting the columns into lists and putting them back. Reproducible code below:

I simply want to extract the last comma-delimited element from each string in the column df:

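# Character vector of locations; the last comma-separated element is the one wanted
df <- c("Lagos, Nigeria", "United States", "Buckingham Palace, Great Britain",
        "Madison Square Garden, NY, New York, USA")
# Data frame with the locations in column `df` plus a row index
df <- data.frame(df, c(1:length(df)), stringsAsFactors = FALSE)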

df$column.desired <- c("Nigeria", "United States", "Great Britain", "USA")
Ronak Shah
Neal Barsch

1 Answer

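We can use sub() with a regular expression that matches any characters (.*) up to a comma, followed by zero or more whitespace characters (\\s*), and then one or more characters that are not a comma ([^,]+) up to the end ($) of the string. That final part is captured as a group ((...)), and the whole match is replaced with the backreference (\\1) to the captured group. Because .* is greedy, the match runs up to the last comma; a string with no comma, such as "United States", does not match and is returned unchanged.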

df$column.desired <- sub(".*,\\s*([^,]+)$", "\\1", df$df)
df$column.desired
#[1] "Nigeria"       "United States" "Great Britain" "USA"   
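
For comparison, here is a minimal sketch of a non-regex way to get the same result by splitting on commas and keeping the last piece. The names x, parts, last_element, and regex_result are illustrative only, and the sketch assumes every string is non-empty:

```r
# Sample data copied from the question
x <- c("Lagos, Nigeria", "United States", "Buckingham Palace, Great Britain",
       "Madison Square Garden, NY, New York, USA")

# Split each string on literal commas (fixed = TRUE disables regex matching)
parts <- strsplit(x, ",", fixed = TRUE)

# Keep the last piece of each split and strip surrounding whitespace
# (assumes non-empty input strings, so every split has at least one piece)
last_element <- trimws(vapply(parts, function(p) p[length(p)], character(1)))

# Result of the regex approach on the same data, for comparison
regex_result <- sub(".*,\\s*([^,]+)$", "\\1", x)

# Check that both approaches agree on this sample
identical(last_element, regex_result)
```

The sub() version is shorter and fully vectorised; the split-based version may be easier to adapt if elements other than the last one are needed later.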
akrun