I want to group a data frame with group_by and then concatenate the strings within each group, using dplyr.

df<-data.frame(varx=c("x1","x1","x2","x2","x2"),vary=c("y1","y2","y3","y4","y5"))

I want the output (newdf) to look like this:

newdf <- data.frame(varx=c("x1","x2"),catY=c("y1,y2","y3,y4,y5"))

I tried the following in dplyr:

df %>% group_by(varx)%>%summarise(catY=paste(vary))
Error: expecting a single value

Also tried the following:

df %>% group_by(varx)%>%mutate(catY=paste(vary))

Source: local data frame [5 x 3]
Groups: varx

I can do this with base data frame operations, but I need help finding a way to do it in dplyr.

David Arenburg
Pradeep
  • `df %>% group_by(varx)%>%summarise(catY=paste(vary, collapse = ","))`. `paste` gives you a vector with one element per row, so you need `collapse` to merge it into a single string – David Arenburg Sep 11 '14 at 13:14
  • Thank you David, why don't you write this comment as an answer? So it will be better promoted. – Mert Nuhoglu Dec 04 '14 at 07:45
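
To illustrate the point in the comment above, here is a minimal base R sketch (the variable names `per_element` and `collapsed` are made up for this example). Without `collapse`, `paste` returns one string per input element, which is presumably why `summarise` complains that it expects a single value; with `collapse`, it returns exactly one string.

```r
# Values of vary for the group varx == "x1"
vary <- c("y1", "y2")

# Without collapse: still one string per element
per_element <- paste(vary)

# With collapse: all elements joined into one string
collapsed <- paste(vary, collapse = ",")

length(per_element)  # number of strings returned without collapse
length(collapsed)    # number of strings returned with collapse
```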

1 Answer

A slightly shorter version of David's comment would be:

library(dplyr)
df %>% group_by(varx) %>% summarise(catY = toString(vary))

#Source: local data frame [2 x 2]
#
#  varx       catY
#1   x1     y1, y2
#2   x2 y3, y4, y5
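
One difference worth noting: `toString` joins with a comma followed by a space, while the output requested in the question has no space. The sketch below puts both variants side by side so the separator can be chosen deliberately. The column name `catY_spaced` is only illustrative, and `stringsAsFactors = FALSE` is added here so the example behaves the same across R versions.

```r
library(dplyr)

df <- data.frame(varx = c("x1", "x1", "x2", "x2", "x2"),
                 vary = c("y1", "y2", "y3", "y4", "y5"),
                 stringsAsFactors = FALSE)

newdf <- df %>%
  group_by(varx) %>%
  summarise(
    catY        = paste(vary, collapse = ","),  # "y1,y2" style, as in the question
    catY_spaced = toString(vary)                # "y1, y2" style, as shown above
  )

newdf
```

If the exact format from the question matters, `paste(vary, collapse = ",")` is the one to use; `toString` is convenient when a space after each comma is acceptable.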
talat