Easier option is separate_rows
library(tidyr)
library(dplyr)
out <- df %>%
separate_rows(y)
-output
out
# A tibble: 7 x 2
x y
<int> <chr>
1 1 a
2 1 b
3 1 c
4 2 c
5 2 d
6 3 f
7 3 g
If we need to do the reverse operation as well, there are couple of ways.
- From the output, do a group by + summarise operation
out %>%
group_by(x) %>%
summarise(y = toString(y), .groups = 'drop')
# A tibble: 3 x 2
x y
<int> <chr>
1 1 a, b, c
2 2 c, d
3 3 f, g
toString
adds a space after the ,
. If we don't want the space
, use paste(y, collapse=",")
or stringr::str_c(y, collapse=",")
- Another way is to replicate the column 'y' and select that column along with 'x' and get the
distinct
. This should also work when there is a single column i.e. no 'x' column (thus we don't know how it was split)
out <- df %>%
mutate(y1 = y) %>%
separate_rows(y1)
distinct(out, x, y)
# A tibble: 3 x 2
x y
<int> <chr>
1 1 a,b,c
2 2 c,d
3 3 f,g