I want to convert the following data.frame:

  >M
  name   ID
  a       1
  b,c     2
  d,e     3
  f       4

to this one:

  >M
 name  ID
 a     1
 b     2
 c     2
 d     3
 e     3
 f     4

How can I split every comma-separated entry in the first column into its own row, repeating the matching ID?

Thanks

Zaynab
    [one](http://stackoverflow.com/questions/29758504/split-data-frame-row-into-multiple-rows-based-on-commas), [two](http://stackoverflow.com/questions/37492809/add-new-line-in-df-using-grep-or-regex), [three](http://stackoverflow.com/questions/30525811/how-to-separate-comma-separated-values-in-r-in-a-new-row), [four](http://stackoverflow.com/questions/33113263/splitting-a-single-column-into-multiple-observation-using-r), [five](http://stackoverflow.com/questions/33571978/split-value-from-a-data-frame-and-create-additional-row-to-store-its-component) – rawr May 31 '16 at 20:50

2 Answers

You can use unnest() from tidyr:

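library(dplyr); library(tidyr)
# split each name on commas into a list column, then expand it to one row per element
# (strsplit() needs a character vector, so a factor column is converted first)
mutate(M, name = strsplit(as.character(name), ",")) %>% unnest(name)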
Source: local data frame [6 x 2]

     ID  name
  (chr) (chr)
1     1     a
2     2     b
3     2     c
4     3     d
5     3     e
6     4     f
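
As a possible alternative to the strsplit() plus unnest() approach above, tidyr also provides separate_rows(), which splits a delimited column and expands the rows in one call. This is only a sketch and not part of the original answer: the data frame M is rebuilt here from the question so the snippet is self-contained, and the result name out is made up for illustration.

```r
library(tidyr)

# Assumed reconstruction of the question's data frame M
M <- data.frame(name = c("a", "b,c", "d,e", "f"),
                ID = 1:4,
                stringsAsFactors = FALSE)

# split name on commas and emit one row per piece, repeating ID
out <- separate_rows(M, name, sep = ",")
print(out)
```

Newer tidyr releases mark separate_rows() as superseded by separate_longer_delim(), so which one to use depends on the installed version.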
Psidom

Here is a base R solution:

# split the names into a list
nameList <- strsplit(df$name, split=",")
# get your new data.frame
newdf <- data.frame(names=unlist(nameList), ID=rep(df$ID, sapply(nameList, length)))

rep repeats each ID once for every name that the split produced in its row, so the same code also works when a row holds three or more names.
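
To make the rep step concrete, here is a small sketch with a row that holds three names. The data frame df3 and the names counts and newdf3 are made up for illustration and are not from the question.

```r
# Hypothetical data: the second row holds three comma-separated names
df3 <- data.frame(name = c("a", "b,c,d", "e"),
                  ID = 1:3,
                  stringsAsFactors = FALSE)

# one character vector of names per original row
nameList <- strsplit(df3$name, split = ",")

# how many names each row was split into
counts <- sapply(nameList, length)

# repeat each ID once per name in its row, then line the IDs up with the flattened names
newdf3 <- data.frame(names = unlist(nameList),
                     ID = rep(df3$ID, counts),
                     stringsAsFactors = FALSE)
print(newdf3)
```

Because the repeat counts come from the split itself, the ID column always has the same length as the flattened names vector.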

data

df <- read.table(header=TRUE, text="name   ID
  a       1
  b,c     2
  d,e     3
  f       4", stringsAsFactors=FALSE)

output

> newdf
  names ID
1     a  1
2     b  2
3     c  2
4     d  3
5     e  3
6     f  4
lmo