I want to group my data by one column and paste the character strings from a different column into a single row. Suppose, for example, I have a data.frame A:

library(dplyr)
A <- data.frame(student = rep(c("John Smith", "Jane Smith"), 3),
                variable1 = rep(c("Var1", "Var1", "Var2"), 2))
A <- arrange(A, student)

     student variable1
1 Jane Smith      Var1
2 Jane Smith      Var1
3 Jane Smith      Var2
4 John Smith      Var1
5 John Smith      Var2
6 John Smith      Var1

I need to transform data.frame A into data.frame B, which has one row per student and pastes together the distinct values of variable1 for that student:

B <- data.frame(student = c("John Smith", "Jane Smith"), 
                variable1 = c(paste("Var1", "Var2", sep = ","),     
                              paste("Var1", "Var2", sep = ",")))

     student variable1
1 John Smith Var1,Var2
2 Jane Smith Var1,Var2

I've attempted numerous group_by and mutate clauses from the dplyr package but haven't found success.

Richard Erickson
David Ranzolin

2 Answers

You can use the data.table package to do this easily, and quickly if you set student to be your key:

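library(data.table)
A <- as.data.table(A)   # convert the data.frame to a data.table
setkey(A, student)      # key (and sort) the table by student
# collapse each student's distinct variable1 values into one string
B <- A[, .(variable1 = paste(unique(variable1), collapse = ",")), by = student]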

I believe you can use the aggregate function to do what you are looking for. Is this what you are trying to do?

df <- unique(A)  # drop duplicate student/variable1 rows
agg <- aggregate(df$variable1, list(df$student), paste, collapse = ",")

> agg
     Group.1         x
1 Jane Smith Var1,Var2
2 John Smith Var1,Var2
Justin Klevs
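
Since the question mentions dplyr, here is a minimal sketch of the same idea using group_by with summarise instead of mutate. It is not taken from either answer above. mutate keeps one row per input row, whereas summarise collapses each group to a single row, which may be why the mutate attempts did not give the desired shape. The example rebuilds A from the question so that it runs on its own.

```r
library(dplyr)

# Same example data as in the question
A <- data.frame(student = rep(c("John Smith", "Jane Smith"), 3),
                variable1 = rep(c("Var1", "Var1", "Var2"), 2))

# One row per student; distinct variable1 values joined with commas
B <- A %>%
  group_by(student) %>%
  summarise(variable1 = paste(unique(variable1), collapse = ","))
```

Here unique() plays the same role as in the data.table answer: it drops repeated values within each student before paste() joins them.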