2

Consider the following data.frame:

df <- data.frame(ID = 1:2, Location = c("Love, Love, Singapore, Love, Europe, United States, Japan, Amazon, Seattle, Orchard Road, Love", 
                                        "Singapore, Singapore, Singapore") , stringsAsFactors = FALSE)

I would like to extract the unique values from the df$Location column. That is, I would like to add a new column that contains only the unique location names for each row, exactly as in the data frame below:

df <- data.frame(ID = 1:2, Location = c("Love, Love, Singapore, Love, Europe, United States, Japan, Amazon, Seattle, Orchard Road, Love", 
                                        "Singapore, Singapore, Singapore") , 
                 Unique.Location = c("Love, Singapore, Europe, United States, Japan, Amazon, Seattle, Orchard Road",
                                     "Singapore"), stringsAsFactors = FALSE)

Any input would be much appreciated.

Cettt
JBH
  • possible duplicate https://stackoverflow.com/questions/28033312/how-do-keep-only-unique-words-within-each-string-in-a-vector – Sotos Jul 26 '19 at 07:32

3 Answers

4

In base R, we can split each string on commas, trim the surrounding whitespace, and paste only the unique values back together for each row:

# split on ",", trim whitespace, keep unique values, rejoin with ", "
df$Unique.Location <- sapply(strsplit(df$Location, ","), function(x) 
                       toString(unique(trimws(x))))
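
To make the individual steps visible, here is a minimal sketch on a single shortened string. The variable names `x`, `parts`, `cleaned`, and `result` are only for illustration and do not appear in the question.

```r
# A shortened example string (illustrative, not the full data from the question)
x <- "Love, Love, Singapore, Love, Europe"

# Step 1: split on the comma; the pieces keep their leading spaces
parts <- strsplit(x, ",")[[1]]

# Step 2: remove the leading/trailing whitespace from each piece
cleaned <- trimws(parts)

# Step 3: drop duplicates (first occurrence wins) and rejoin with ", "
result <- toString(unique(cleaned))
result
# "Love, Singapore, Europe"
```

Note that unique() keeps the values in order of first appearance, which is why the output order matches the input.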

Alternatively, using tidyr::separate_rows:

library(dplyr)

df %>% 
  tidyr::separate_rows(Location, sep = ", ") %>%
  group_by(ID) %>%
  summarise(Unique.Location = toString(unique(Location)), 
            Location = toString(Location))
Ronak Shah
3

You can use a combination of strsplit, sapply and unique:

df$Unique.Location <- sapply(strsplit(df$Location, split = ", "), function(x) paste0(unique(x), collapse = ", "))
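
Splitting on ", " assumes exactly one space after every comma. As a hedged sketch, the same idea can be wrapped in a small helper that trims whitespace instead and is then checked against the output requested in the question. The helper name `unique_items` and the variable `expected` are hypothetical.

```r
df <- data.frame(
  ID = 1:2,
  Location = c("Love, Love, Singapore, Love, Europe, United States, Japan, Amazon, Seattle, Orchard Road, Love",
               "Singapore, Singapore, Singapore"),
  stringsAsFactors = FALSE
)

# Desired result, copied from the question
expected <- c("Love, Singapore, Europe, United States, Japan, Amazon, Seattle, Orchard Road",
              "Singapore")

# Hypothetical helper: split on a literal separator, trim, de-duplicate, rejoin.
# vapply() is used instead of sapply() so the result is always a character vector.
unique_items <- function(x, sep = ",") {
  vapply(strsplit(x, sep, fixed = TRUE),
         function(parts) toString(unique(trimws(parts))),
         character(1))
}

df$Unique.Location <- unique_items(df$Location)
identical(df$Unique.Location, expected)
# TRUE
```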
Cettt
0

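An option using the tidyverse:

library(dplyr)
library(purrr)
library(stringr)  # provides str_extract_all()

df %>% 
     # extract comma-delimited items rather than single words, so that
     # multi-word names such as "United States" stay intact
     mutate(Unique.Location = str_extract_all(Location, "[^,]+") %>%
          map_chr(~ toString(unique(trimws(.x)))))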
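
A note on choosing the pattern for regex-based extraction: a word pattern such as "\\w+" matches runs of word characters, so multi-word names get split apart, whereas a pattern matching everything between commas keeps them together. The sketch below shows the difference in base R with regmatches() and gregexpr(), to avoid extra dependencies. The example string and variable names are illustrative only.

```r
# Illustrative string with multi-word location names
x <- "Singapore, United States, Orchard Road, Singapore"

# Word-level extraction: "United States" becomes "United" and "States"
words <- regmatches(x, gregexpr("\\w+", x))[[1]]
by_word <- toString(unique(words))

# Item-level extraction: everything between commas, then trimmed
items <- trimws(regmatches(x, gregexpr("[^,]+", x))[[1]])
by_item <- toString(unique(items))

by_word
# "Singapore, United, States, Orchard, Road"
by_item
# "Singapore, United States, Orchard Road"
```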
akrun