How can I reproduce the result of the aggregate() call below using data.table?

xy <- read.table(text = "Subject Product
1   ProdA
1   ProdB
1   ProdC
2   ProdB
2   ProdC
2   ProdD
3   ProdA
3   ProdB", header = TRUE)

aggregate(Product ~ Subject, FUN = paste, collapse = ";", data = xy)

Using dcast I get multiple columns, some of them containing NA, but I need to collapse all columns for each subject into a single string, using a ";" separator and dropping the NA values.

dcast(xy, Subject ~ Product, value.var = "Product")
Roland

1 Answer

With data.table, group by Subject and summarise by pasting the products together instead of using dcast:

library(data.table)
setDT(xy)[, .(Product = paste(Product, collapse=";")), by = Subject]
akrun
  • Excellent, I had no idea it would be possible to do it that way. – Roland Jan 15 '20 at 19:11
  • With `data.table`, the syntax is `dt[i, j, by]`. `i` takes the row index or a logical filter, `by` takes the grouping columns, and `j` holds the column summaries, or creates new columns with `:=`. – akrun Jan 15 '20 at 19:13
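
To make the `dt[i, j, by]` form from the comment above concrete, here is a minimal sketch on the question's data. The `Product != "ProdA"` filter and the `n_products` column are not from the original post; they are illustrative assumptions, added only to show `i` and `:=` in use.

```r
library(data.table)

# Same toy data as in the question, built directly as a data.table
xy <- data.table(
  Subject = c(1, 1, 1, 2, 2, 2, 3, 3),
  Product = c("ProdA", "ProdB", "ProdC", "ProdB", "ProdC", "ProdD", "ProdA", "ProdB")
)

# i:  keep only the rows where Product is not "ProdA" (hypothetical filter)
# j:  collapse the remaining products into one string
# by: evaluate j once per Subject
res <- xy[Product != "ProdA",
          .(Product = paste(Product, collapse = ";")),
          by = Subject]

# := in j adds a column by reference instead of summarising;
# .N is the number of rows in the current group (n_products is a hypothetical name)
xy[, n_products := .N, by = Subject]
```

Here `res` has one row per Subject, while `xy` keeps all eight rows and gains the extra column. Note that `:=` modifies `xy` in place, so no reassignment is needed.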