sample_data <- data.frame(id = c("123abc", "def456", "789ghi", "123abc"),
                          some_str = c("carrots", "bananas", "apples", "cabbage"))

I would like to know how to wrangle sample_data to look like this:

desired_df <- data.frame(id = c("123abc", "def456", "789ghi"),
                         some_str_concat = c("carrots, cabbage", "bananas", "apples"))

Each id may appear multiple times. In that case I would like to collect the corresponding values from some_str and concatenate them into a new column, so that the new df has one row per id.

In the example above, id 123abc appears twice: first with a value of "carrots" and then again with a value of "cabbage". Thus, the desired data frame has a single row for 123abc with the value "carrots, cabbage".

How can I do this? Ideally in either base R or dplyr.

Doug Fir

1 Answer

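library(dplyr)
sample_data %>%
  group_by(id) %>%
  # paste all some_str values within each id into one string
  mutate(some_str = paste(some_str, collapse = ", ")) %>%
  # rows within an id are now identical, so keep one per id
  distinct()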
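
Since the question also asks about base R, here is a minimal sketch using aggregate(). It is not part of the original answer. The result name is arbitrary, and the column is renamed to some_str_concat only to match the desired output in the question. Note that aggregate() returns the groups sorted by id, so the row order may differ from desired_df.

```r
# Sample data from the question
sample_data <- data.frame(id = c("123abc", "def456", "789ghi", "123abc"),
                          some_str = c("carrots", "bananas", "apples", "cabbage"))

# Collapse all some_str values that share an id into one comma-separated string
result <- aggregate(some_str ~ id, data = sample_data,
                    FUN = function(x) paste(x, collapse = ", "))

# Rename the collapsed column to match the name used in the question
names(result)[names(result) == "some_str"] <- "some_str_concat"

print(result)
```

If the original row order matters, one option is to reorder the result afterwards, for example with result[match(unique(sample_data$id), result$id), ].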
bunbun
Lufy