0

I have 2 columns ID and Product:

ID  Product
A   Clothing
B   Food
A   Food
A   Furniture
C   Food
B   Clothing

How can I create a dataframe using R where each ID will have the products as comma separated as below:

ID  Product
A   Clothing, Food, Furniture
B   Food, Clothing
C   Food, Clothing
Madhumita
  • 489
  • 1
  • 7
  • 15

1 Answers1

3

We can either use one of the group by functions. With data.table, we convert the 'data.frame' to 'data.table' (setDT(df1)), grouped by 'ID', we paste the elements of 'Product' together. The toString is a wrapper for paste(., collapse=', ').

library(data.table)
setDT(df1)[,list(Product=toString(Product)), by = ID]

A similar option with dplyr is

library(dplyr
df1 %>%
   group_by(ID) %>%
   summarise(Product= toString(Product))

Or we can do with aggregate from base R

aggregate(Product~ID, df1, FUN=toString)
#    ID                   Product
#  1  A Clothing, Food, Furniture
#  2  B            Food, Clothing
#  3  C                      Food
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Thank you. All 3 ways worked fine but leaving me with another challenge. There are duplicate product values for each ID. How do I get rid of them, see below : ID Product A Clothing, Clothing Food, Furniture, Furniture B Food,Food,Food, Clothing C Food, Clothing, Clothing – Madhumita Feb 09 '16 at 07:09
  • @Madhumita Please check the edited comment with the code. I think it should solve it. – akrun Feb 09 '16 at 07:15
  • I tried unique before posing this comment. It didn't work. I have upvoted your answer. – Madhumita Feb 09 '16 at 07:22
  • @Madhumita Can you update your post with the example. It is difficult to copy/paste the example from the comment. – akrun Feb 09 '16 at 07:23
  • I'll put it as a separate question. Cannot post a question within 90 minutes. You were very helpful. Thanks once again. – Madhumita Feb 09 '16 at 07:39