Although I see several similar questions on Stack Overflow about this problem, I cannot get my syntax to work. I want to split comma-separated values into new columns in my data frame. When I use the following syntax, the resulting data frame does not make sense:

dat <- data.frame(ID = c(1:10),
                  var1 = rep(c("A","B"),5),
                  var2 = c(NA,"100,101,102","105","108,110","106","105,107,109,103","107,106",NA,"101",NA))

dat$var2 = as.character(dat$var2)

splitdat <- do.call(rbind, strsplit(dat$var2, split = ","))
splitdat <- data.frame(apply(splitdat, 2, as.numeric))

The call strsplit(dat$var2, split = ",") returns a correct list, but I can't add these values as new columns to my data frame.
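For what it's worth, a likely reason the result does not make sense is that rbind recycles shorter vectors to the length of the longest one, so short rows get padded with repeats of their own values. A minimal sketch using three of the strings from above (`parts` and `m` are names made up for this illustration):

```r
# Three of the var2 strings, split on commas
parts <- strsplit(c("100,101,102", "105", "105,107,109,103"), ",")

# rbind pads every row to the longest length (4) by recycling values;
# suppressWarnings() hides the "not a multiple of vector length" warning
m <- suppressWarnings(do.call(rbind, parts))

dim(m)   # 3 rows, 4 columns
m[1, ]   # "100" "101" "102" "100"  (the last value is recycled, not real)
m[2, ]   # "105" "105" "105" "105"
```

So the matrix has the right shape for cbind, but the padded cells contain repeated values instead of NA.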

Does anyone have the answer?

The desired output (for the first 4 IDs) would be:

  ID var1 var2
1  1    A   NA
2  2    B  100
3  2    B  101
4  2    B  102
5  3    A  105
6  4    B  108
7  4    B  110
Joep_S

1 Answer

I look forward to seeing a better answer, but you could use base R as follows: repeat each row once per value in var2 (the number of commas plus one), then bind the split values back on.

reprowsby <- 
  rep(1:nrow(dat), lengths(regmatches(dat$var2, gregexpr(",", dat$var2))) + 1)

cbind(dat[reprowsby, -3], var2 = unlist(strsplit(dat$var2, ",")))

    ID var1 var2
1    1    A <NA>
2    2    B  100
2.1  2    B  101
2.2  2    B  102
3    3    A  105
4    4    B  108
4.1  4    B  110
5    5    A  106
6    6    B  105
6.1  6    B  107
6.2  6    B  109
6.3  6    B  103
7    7    A  107
7.1  7    A  106
8    8    B <NA>
9    9    A  101
10  10    B <NA>
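A possible variant of the same idea, offered as a sketch rather than as part of the answer above: strsplit keeps an NA input as a single NA element, so lengths() on its result gives the repeat counts directly. The names `parts`, `idx` and `long` are made up here, and converting var2 to numeric and resetting the row names are assumptions about what the asker wants.

```r
dat <- data.frame(ID = 1:10,
                  var1 = rep(c("A", "B"), 5),
                  var2 = c(NA, "100,101,102", "105", "108,110", "106",
                           "105,107,109,103", "107,106", NA, "101", NA),
                  stringsAsFactors = FALSE)

# Split once; an NA input stays a single NA element of length 1
parts <- strsplit(dat$var2, ",")

# Repeat each row index once per value found in var2
idx <- rep(seq_len(nrow(dat)), lengths(parts))

# Bind the repeated ID/var1 rows to the flattened, numeric values
long <- cbind(dat[idx, c("ID", "var1")],
              var2 = as.numeric(unlist(parts)))

# Replace row names like "2.1" with plain 1..n (assumed preference)
rownames(long) <- NULL

head(long, 7)
```

This avoids the separate regmatches/gregexpr step, at the cost of keeping the split list around in a variable.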
s_baldur