I want to know how to combine unique text from different cells in a data frame in R.

I'm using the mtcars data. The data frame below contains information about IVs and DVs for stats tests.

> ## sets up data
> df1 <- 
+   data.frame(
+     test_number = c(1, 2, 3, 4),
+     DV = c("mpg", "mpg", "disp", "disp"),
+     IV_list = c("hp", "drat", "hp + wt", "hp + qsec")
+   )
> df1
  test_number   DV   IV_list
1           1  mpg        hp
2           2  mpg      drat
3           3 disp   hp + wt
4           4 disp hp + qsec

I want to collapse the IV_list variable within each unique value of the DV variable, keeping each IV only once. I can do this by typing the result in manually:

> df2 <- 
+   data.frame(
+     test_number = c(5, 6),
+     DV = c("mpg", "disp"),
+     IV_list = c("hp + drat", "hp + wt + qsec")
+   )
> df2
  test_number   DV        IV_list
1           5  mpg      hp + drat
2           6 disp hp + wt + qsec

Is there a way to use code to create df2 from df1 without resorting to manual input?

Mel
  • First [Split comma-separated strings in a column into separate rows](https://stackoverflow.com/questions/13773770/split-comma-separated-strings-in-a-column-into-separate-rows) (with appropriate separator). Then [Collapse / concatenate / aggregate a column to a single comma separated string within each group](https://stackoverflow.com/questions/15933958/collapse-concatenate-aggregate-a-column-to-a-single-comma-separated-string-w) on `unique` values. – Henrik Jul 06 '21 at 21:19
  • With some help from `dplyr` and `tidyr` you can do `df1 %>% group_by(DV) %>% tidyr::separate_rows(IV_list, sep = " \\+ ") %>% summarize(IV_list = paste(unique(IV_list), collapse = " + "))` – MrFlick Jul 06 '21 at 21:22
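
The two steps described in the comments (split each string on the separator, then collapse the unique values within each group) could also be sketched in base R without extra packages. This is only one possible approach: the helper name `collapse_unique` is made up for illustration, and numbering the new rows by continuing from the largest `test_number` in `df1` is an assumption based on the 5 and 6 used in the manual `df2`.

```r
# Rebuild the example data from the question.
# Assumes R >= 4.0, where data.frame() keeps strings as character vectors.
df1 <- data.frame(
  test_number = c(1, 2, 3, 4),
  DV = c("mpg", "mpg", "disp", "disp"),
  IV_list = c("hp", "drat", "hp + wt", "hp + qsec")
)

# Hypothetical helper: split every string on "+", trim whitespace,
# keep each term once (in order of first appearance), and re-join.
collapse_unique <- function(x, sep = " + ") {
  terms <- trimws(unlist(strsplit(x, "+", fixed = TRUE)))
  paste(unique(terms), collapse = sep)
}

# Keep the DV groups in the order they first appear in df1.
dv_levels <- unique(df1$DV)

df2 <- data.frame(
  # Assumption: new test numbers continue after the largest existing one.
  test_number = max(df1$test_number) + seq_along(dv_levels),
  DV = dv_levels,
  IV_list = unname(vapply(
    split(df1$IV_list, factor(df1$DV, levels = dv_levels)),
    collapse_unique,
    character(1)
  ))
)
df2
```

For the example data this gives `hp + drat` for `mpg` and `hp + wt + qsec` for `disp`, which matches the manually built `df2`. Splitting with `fixed = TRUE` avoids having to escape `+` as a regular expression.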
