Repeat rows based on a string and replace values of the new rows

Question

I'm having difficulties with this problem.

df <- data.frame(number=1:3,name=c('a','b','c'),code=c("5464","5463,5678","45363,6346,6435"))

If the third column contains only one number, the row should not be repeated. For rows with more than one number, repeat the row once per number and replace the code with each number in turn.

Original data

number name code
  1     a  "5464"
  2     b  "5463,5678"
  3     c  "45363,6346,6435"

Desired output

number name code
  1     a  "5464"
  2     b  "5463"
  2     b  "5678"
  3     c  "45363"
  3     c  "6346"
  3     c  "6435"

I really don't know where to start. I tried using stringr::str_split_fixed to separate the strings and count the number of occurrences. But after that I'm having difficulties in repeating the rows based on these occurrences and replacing with the corresponding value.

Any help is appreciated.

score 2 · Accepted Answer · answered Aug 24 '16 at 13:42


We can use separate_rows from tidyr (available from version 0.6.0)

library(tidyr)
separate_rows(df, code)

Or cSplit from splitstackshape

library(splitstackshape)
cSplit(df, "code", ",", "long")
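
If neither package is available, the same idea can be sketched in base R: split each string, repeat each row once per piece, then overwrite the column with the pieces. This is only a minimal sketch, assuming the codes are separated by plain commas with no surrounding spaces; the names parts and out are just illustrative.

```r
df <- data.frame(number = 1:3, name = c("a", "b", "c"),
                 code = c("5464", "5463,5678", "45363,6346,6435"),
                 stringsAsFactors = FALSE)  # keep code as character on R < 4.0

# Split each code string on commas: a list with one character vector per row
parts <- strsplit(df$code, ",", fixed = TRUE)

# Repeat each row index once per piece, then replace code with the pieces
out <- df[rep(seq_len(nrow(df)), lengths(parts)), ]
out$code <- unlist(parts)
rownames(out) <- NULL  # drop the duplicated row names left by the indexing
out
```

Rows with a single code are repeated exactly once, so they pass through unchanged.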

answered Aug 24 '16 at 13:42

akrun


The second one works fine! The first function doesn't seem to exist according to R documentation. – cimentadaj Aug 24 '16 at 13:51
@user3617958 It is in the new tidyr version, i.e. 0.6.0. It works fine for me. Which version of tidyr do you have? – akrun Aug 24 '16 at 14:02
Yeah, I'm using 0.4.1. Must be that. – cimentadaj Aug 24 '16 at 14:11
