I'm a beginner in R. I have a CSV file with data like the following:

ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
730 DV,GTH,LYT
567 EDR,TYU,EOP,OMN
567 FGH,KIH,IOP

I want to remove the duplicate IDs and append the values from the duplicate rows to the Values column of a single row per ID, like this:

ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG,DV,GTH,LYT
567 EDR,TYU,EOP,OMN,FGH,KIH,IOP

How can I achieve this in R?

Jaap
LearneR

2 Answers

library(dplyr)

dat <- read.table(text="ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
730 DV,GTH,LYT
567 EDR,TYU,EOP,OMN
567 FGH,KIH,IOP", header=TRUE)

# Collapse all Values within each ID into one comma-separated string
dat2 <- dat %>% group_by(ID) %>% summarise(val=paste(Values, collapse=","))
Jaap
You can try data.table (here df1 is assumed to be the data frame read from your CSV file):

library(data.table)
setDT(df1)[, list(Values = paste(Values, collapse = ",")), by = ID]

Or using base R:

aggregate(. ~ ID, df1, paste, collapse = ",")
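For completeness, here is a minimal end-to-end sketch of the base R approach on the sample data from the question. The name `df1` follows the answer above; the names `res` and the re-ordering step are additions for illustration only. `aggregate()` returns the groups sorted by ID, so if the order of first appearance matters (820, 730, 567, as in the desired output), one possible way to restore it is with `match()`:

```r
# Sample data from the question, read into df1 (the name assumed in the answer)
df1 <- read.table(text = "ID  Values
820 D1,D2,FE
730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
730 DV,GTH,LYT
567 EDR,TYU,EOP,OMN
567 FGH,KIH,IOP", header = TRUE, stringsAsFactors = FALSE)

# Paste all Values within each ID together; the result is sorted by ID
res <- aggregate(Values ~ ID, data = df1, FUN = paste, collapse = ",")

# Illustrative extra step: put the rows back in order of first appearance
res <- res[match(unique(df1$ID), res$ID), ]
rownames(res) <- NULL

print(res)
```

With a real file, the `read.table(text = ...)` call would be replaced by whatever call you already use to read your CSV.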
akrun