How to do reverse of aggregate command on a dataframe?

Question

I want to convert the following data.frame:

  >M
  name   ID
  a       1
  b,c     2
  d,e     3
  f       4

to this one:

  >M
  name   ID
  a       1
  b       2
  c       2
  d       3
  e       3
  f       4

How can I do this conversion for all elements of the first column?

Thanks

[one](http://stackoverflow.com/questions/29758504/split-data-frame-row-into-multiple-rows-based-on-commas), [two](http://stackoverflow.com/questions/37492809/add-new-line-in-df-using-grep-or-regex), [three](http://stackoverflow.com/questions/30525811/how-to-separate-comma-separated-values-in-r-in-a-new-row), [four](http://stackoverflow.com/questions/33113263/splitting-a-single-column-into-multiple-observation-using-r), [five](http://stackoverflow.com/questions/33571978/split-value-from-a-data-frame-and-create-additional-row-to-store-its-component) — rawr, May 31 '16 at 20:50

score 2 · Answer 1 · answered May 31 '16 at 19:51

You can use unnest() from tidyr:

library(dplyr)
library(tidyr)
# split each name on commas into a list column, then expand to one row per element
mutate(M, name = strsplit(as.character(name), ",")) %>% unnest(name)
Source: local data frame [6 x 2]

     ID  name
  (chr) (chr)
1     1     a
2     2     b
3     2     c
4     3     d
5     3     e
6     4     f
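
As a possible alternative (not part of the answer above), tidyr also provides separate_rows(), which splits a delimited column and expands the rows in one call. A minimal sketch, assuming a tidyr version that includes separate_rows() and that M looks like the data printed in the question (the construction of M below is an assumption, since the question only shows its printed form):

```r
library(tidyr)

# Assumed reconstruction of the question's data.frame
M <- data.frame(name = c("a", "b,c", "d,e", "f"),
                ID = 1:4,
                stringsAsFactors = FALSE)

# Split `name` on commas and emit one row per piece;
# the matching ID is repeated for every piece of the same row
out <- separate_rows(M, name, sep = ",")
out
```

The sep argument is treated as a regular expression, so a delimiter such as "." or "|" would need escaping.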

score 1 · Accepted Answer · answered May 31 '16 at 19:56

Here is a base R solution:

# split the names into a list
nameList <- strsplit(df$name, split=",")
# get your new data.frame
newdf <- data.frame(names=unlist(nameList), ID=rep(df$ID, sapply(nameList, length)))

This uses rep to repeat each ID once for every name that its row was split into. Because the repeat counts come from the length of each split, it works for any number of names per row, including 3 or more.

data

df <- read.table(header=TRUE, text="name   ID
  a       1
  b,c     2
  d,e     3
  f       4", stringsAsFactors=FALSE)

output

> newdf
  names ID
1     a  1
2     b  2
3     c  2
4     d  3
5     e  3
6     f  4
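
To make the role of the repeat counts in the rep call concrete, here is a small sketch on a row with three names. The data.frame df3 and the names counts and newdf3 are hypothetical, made up for illustration; lengths() is used as a shorthand for sapply(nameList, length) and assumes R 3.2.0 or later.

```r
# Hypothetical input with a three-name row
df3 <- data.frame(name = c("a", "b,c,d", "e"),
                  ID = 1:3,
                  stringsAsFactors = FALSE)

# One character vector of names per original row
nameList <- strsplit(df3$name, split = ",")

# Number of names in each row: these become the repeat counts for rep()
counts <- lengths(nameList)

# Flatten the names and repeat each ID once per name in its row
newdf3 <- data.frame(names = unlist(nameList),
                     ID = rep(df3$ID, times = counts),
                     stringsAsFactors = FALSE)
newdf3
```

Here counts is 1, 3, 1, so ID 2 appears three times, once for each of b, c and d.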
