I have data structured like this:

    structure(list(id = c("4031", "1040;2040;3040", "4040", 
    "1050;2050;3050"), description = c("Sentence A", 
    "Sentence B", "Sentence C", 
    "Sentence D")), row.names = 1:4, class = "data.frame")

              id description
1           4031  Sentence A
2 1040;2040;3040  Sentence B
3           4040  Sentence C
4 1050;2050;3050  Sentence D

I would like to restructure the data so that ids containing ";" are split into separate rows, one id per row, like this:

    structure(list(id = c("4031", "1040", "2040", "3040", "4040", 
    "1050", "2050", "3050"), description = c("Sentence A", 
    "Sentence B", "Sentence B", "Sentence B", "Sentence C", 
    "Sentence D", "Sentence D", "Sentence D")), row.names = 1:8, class = "data.frame")

   id description
1 4031  Sentence A
2 1040  Sentence B
3 2040  Sentence B
4 3040  Sentence B
5 4040  Sentence C
6 1050  Sentence D
7 2050  Sentence D
8 3050  Sentence D

I know I can split the id column with strsplit, but I can't work out an efficient way to convert the result to rows without a loop:

    strsplit(as.character(a$id), ";")
MatthewR

2 Answers

Using base R:

    IDs <- strsplit(df$id, ";")
    data.frame(ID = unlist(IDs), Description = rep(df$description, lengths(IDs)))

        ID Description
    1 4031  Sentence A
    2 1040  Sentence B
    3 2040  Sentence B
    4 3040  Sentence B
    5 4040  Sentence C
    6 1050  Sentence D
    7 2050  Sentence D
    8 3050  Sentence D
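
If the data frame has more columns than `description`, the same idea can be generalized by repeating row indices instead of a single column. The following is only a sketch under a few assumptions: the question's data is assigned to `df`, and `split_rows()` is a hypothetical helper name chosen for illustration (it is not part of base R). It also keeps the original column names `id` and `description`.

```r
# Sample data from the question, assigned to `df` (the name used in the answers)
df <- data.frame(
  id = c("4031", "1040;2040;3040", "4040", "1050;2050;3050"),
  description = c("Sentence A", "Sentence B", "Sentence C", "Sentence D"),
  stringsAsFactors = FALSE
)

# split_rows() is a hypothetical helper, not a base R function
split_rows <- function(data, col, sep = ";") {
  # One character vector of pieces per original row
  parts <- strsplit(as.character(data[[col]]), sep, fixed = TRUE)
  # Repeat each row index once per piece so all other columns are carried along
  out <- data[rep(seq_len(nrow(data)), lengths(parts)), , drop = FALSE]
  # Overwrite the split column with the individual pieces
  out[[col]] <- unlist(parts)
  # Reset row names to 1..n
  rownames(out) <- NULL
  out
}

result <- split_rows(df, "id")
```

Here `lengths(parts)` is `c(1, 3, 1, 3)`, so rows 2 and 4 are each repeated three times, with no explicit loop.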
Jilber Urbina

One quite handy possibility with tidyr could be:

    library(tidyr)
    separate_rows(df, id, sep = ";")

    id description
1 4031  Sentence A
2 1040  Sentence B
3 2040  Sentence B
4 3040  Sentence B
5 4040  Sentence C
6 1050  Sentence D
7 2050  Sentence D
8 3050  Sentence D
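
On more recent tidyr releases, the same reshaping can probably also be written with `separate_longer_delim()`. This is a sketch that assumes tidyr 1.3.0 or later is installed (where, as far as I know, `separate_rows()` is marked as superseded) and that the question's data is assigned to `df`.

```r
library(tidyr)

# Sample data from the question, assigned to `df` (the name used in the answers)
df <- data.frame(
  id = c("4031", "1040;2040;3040", "4040", "1050;2050;3050"),
  description = c("Sentence A", "Sentence B", "Sentence C", "Sentence D"),
  stringsAsFactors = FALSE
)

# Assumes tidyr >= 1.3.0; `delim` is treated as a fixed string, not a regex
result <- separate_longer_delim(df, id, delim = ";")
```

Unlike the default `sep` of `separate_rows()`, which is a regular expression matching runs of non-alphanumeric characters, the delimiter here is spelled out, so ids containing other punctuation would not be split by accident.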
tmfmnk