Here is my input. v1 and v2 have the same length, which equals the number of rows in df.

v1 = c("a", "b,c", "d")
v2 = c("r", "s,t", "u")
df = data.frame(id=1:3, name = c("X", "Y", "Z"))
> df
  id name
1  1    X
2  2    Y
3  3    Z

How do I produce the following output in R?

# expected output:
odf = data.frame(id=c(1,2,2,3), name = c("X", "Y", "Y", "Z"), v1 = c("a", "b", "c", "d"), v2 = c("r", "s", "t", "u"))
> odf
  id name v1 v2
1  1    X  a  r
2  2    Y  b  s
3  2    Y  c  t
4  3    Z  d  u

That is, whenever v1 or v2 contains a "compound" (comma-separated) element, split it into one row per piece and duplicate the values of the other columns in the final data frame odf.

biocyberman

2 Answers

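Use tidyr's separate_rows.

First bind the vectors to df as columns, then split the rows. By default separate_rows splits on any run of characters that are not alphanumeric or a dot (sep = "[^[:alnum:].]+"), so it picks up the comma on its own; you can also specify the separator explicitly with the sep argument.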

library(tidyr)
library(dplyr)

df %>% bind_cols(v1 = v1, v2 = v2) %>% 
  separate_rows(v1, v2)
  id name v1 v2
1  1    X  a  r
2  2    Y  b  s
3  2    Y  c  t
4  3    Z  d  u
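
If your values may contain other punctuation (spaces, hyphens, and so on), the default separator would split on those too. A minimal sketch of passing the separator explicitly, assuming the same v1, v2, and df as in the question; the names input and res are only for illustration:

```r
library(tidyr)

v1 <- c("a", "b,c", "d")
v2 <- c("r", "s,t", "u")
df <- data.frame(id = 1:3, name = c("X", "Y", "Z"))

# Attach the vectors as character columns on a copy of df.
input <- df
input$v1 <- v1
input$v2 <- v2

# Split only on a literal comma instead of the default
# "any non-alphanumeric, non-dot" pattern.
res <- separate_rows(input, v1, v2, sep = ",")
```

Note that sep is treated as a regular expression, so a separator such as "." or "|" would need escaping.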
phiver

Using cSplit from splitstackshape

library(splitstackshape)
cSplit(cbind(df, v1, v2), c("v1", "v2"), sep = ",", direction = "long")
#   id name v1 v2
#1:  1    X  a  r
#2:  2    Y  b  s
#3:  2    Y  c  t
#4:  3    Z  d  u
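
For comparison, the same split-and-repeat rule can be sketched in base R without any packages. This is only an illustration of the idea, not how cSplit is implemented, and it assumes v1 and v2 split into the same number of pieces in every row; the names p1, p2, and idx are made up for the example:

```r
v1 <- c("a", "b,c", "d")
v2 <- c("r", "s,t", "u")
df <- data.frame(id = 1:3, name = c("X", "Y", "Z"))

# Split each element on a literal comma; one list entry per original row.
p1 <- strsplit(v1, ",", fixed = TRUE)
p2 <- strsplit(v2, ",", fixed = TRUE)

# Assumption: both vectors yield the same number of pieces in each row.
stopifnot(lengths(p1) == lengths(p2))

# Repeat each row index once per piece, then attach the flattened pieces.
idx <- rep(seq_len(nrow(df)), times = lengths(p1))
odf <- cbind(df[idx, ], v1 = unlist(p1), v2 = unlist(p2))
rownames(odf) <- NULL
```

Unlike cSplit, this sketch stops with an error when the piece counts differ between v1 and v2 rather than padding with NA.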
akrun