I need to merge the Class values of each ID group into a single string in a new column, e.g.

df = read.table(text="ID    Class
1    a
1    b
2    a
2    c
3    b
4    a
4    b
4    c", header=TRUE)

and the output should be something like

ID    Class Class.aggr
1      a     a, b
1      b
2      a     a, c
2      c
3      b     b
4      a     a, b, c
4      b
4      c

I thought about using cat(union), but the data set is very large and I don't know how to select the Class values that belong to each ID (tapply doesn't seem to work).

Will

2 Answers

This is a possible approach with dplyr

Create data.frame:

ID <- c(1,1,2,2,3,4,4,4)
Class <- c("a","b","a", "c", "b", "a", "b", "c")
df <- data.frame(ID,Class)

And then:

library(dplyr)

# note: older dplyr versions used %.% for chaining; it has since been replaced by %>%
df <- df %>%
  group_by(ID) %>%                                     # group by ID
  mutate(count = 1:n()) %>%                            # number the rows within each ID
  mutate(Class.aggr = paste(Class, collapse = ", "))   # paste the "Class" values of each group into a new column

df$Class.aggr[df$count > 1] <- ""   # keep the aggregate only on the first row per ID
df$count <- NULL                    # drop the helper column

#>df
#  ID Class Class.aggr
#1  1     a       a, b
#2  1     b           
#3  2     a       a, c
#4  2     c           
#5  3     b          b
#6  4     a    a, b, c
#7  4     b           
#8  4     c           
talat
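
The dplyr approach above can probably be condensed into a single mutate() call. The following is only a sketch, assuming a dplyr version that provides %>% and row_number(); the result name res is chosen here for illustration and is not part of the original answer.

```r
library(dplyr)

df <- data.frame(ID = c(1, 1, 2, 2, 3, 4, 4, 4),
                 Class = c("a", "b", "a", "c", "b", "a", "b", "c"))

res <- df %>%
  group_by(ID) %>%
  # within each ID, put the pasted string on the first row and "" on the rest
  mutate(Class.aggr = ifelse(row_number() == 1,
                             paste(Class, collapse = ", "),
                             "")) %>%
  ungroup()   # drop the grouping so later operations act on the whole table
```

This avoids the helper count column and the two clean-up assignments, at the cost of calling paste() once per group inside ifelse().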

Here's a solution using base functions

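df$class.arg <- ""   # start with an empty string on every row
# fill only the first row of each ID; setting the factor levels to unique(df$ID)
# keeps tapply's groups in order of first appearance, so they line up with those rows
df$class.arg[!duplicated(df$ID)] <-
    tapply(df$Class, factor(df$ID, unique(df$ID)), paste, collapse=",")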

which also produces

  ID Class class.arg
1  1     a       a,b
2  1     b          
3  2     a       a,c
4  2     c          
5  3     b         b
6  4     a     a,b,c
7  4     b          
8  4     c 
MrFlick
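
Another base R variant, offered only as a sketch, uses ave() instead of tapply(). It assumes Class is stored as a character vector (hence stringsAsFactors = FALSE below); the intermediate name agg is made up for this example.

```r
df <- data.frame(ID = c(1, 1, 2, 2, 3, 4, 4, 4),
                 Class = c("a", "b", "a", "c", "b", "a", "b", "c"),
                 stringsAsFactors = FALSE)

# ave() returns a vector as long as its input, with each group's pasted
# string repeated on every row of that group
agg <- ave(df$Class, df$ID, FUN = function(x) paste(x, collapse = ","))

# blank out every row except the first one of each ID
agg[duplicated(df$ID)] <- ""
df$Class.aggr <- agg
```

Because ave() keeps the result aligned with the input rows, this version does not depend on the order of the factor levels.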