I'm trying to replicate the aggregate() base function with data.table syntax in this particular scenario:
library(data.table)
# make it reproducible
set.seed(16)
# create data.table
DT <- data.table(source = sample(letters, 100, replace = TRUE), target = sample(LETTERS, 100, replace = TRUE))
# source target
# 1: j J
# 2: d K
# 3: w L
# 4: g J
# ...
# aggregate using base function
aggregate(list(target = DT$target), by = list(source = DT$source), FUN = function(x) paste(x, sep = ", "))
# source target
#1 a L, W, S, W
#2 b V, H, R, J, G, W, N
#3 c Y, C, I, K
#4 d K, A, P, V
# ...
I tried a couple of things using data.table syntax, but I couldn't get either to work:
DT[, .(target = paste(target, sep = ", ")), by = source]
# source target
# 1: r P
# 2: r I
# 3: r Y
# 4: r G
# ...
DT[, target := paste(target, sep = ", "), by = source]
# source target
# 1: r P
# 2: g C
# 3: l U
# 4: f J
# ...
What's the right way to do this?
Bonus points: remove duplicate LETTERS in the output (i.e., row 1 should be L, W, S, not L, W, S, W).
Thanks!
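
A minimal sketch of one direction that might work, assuming the goal is one row per source with the targets joined into a single string. In paste(), sep only separates the arguments passed to the call, while collapse joins the elements of a single vector, so collapse is probably what is needed here. The table toy and its values are hypothetical (hand-made rather than the sampled DT above) so the result does not depend on the random seed:

```r
library(data.table)

# Hypothetical hand-made table, standing in for the sampled DT
toy <- data.table(
  source = c("a", "a", "a", "a", "b", "b"),
  target = c("L", "W", "S", "W", "V", "H")
)

# collapse joins all elements of target within each group into one string
collapsed <- toy[, .(target = paste(target, collapse = ", ")), by = source]

# Bonus: drop repeated letters within each group before collapsing
deduped <- toy[, .(target = paste(unique(target), collapse = ", ")), by = source]

print(collapsed)
print(deduped)
```

The same j expressions should apply unchanged to DT. Note that by = returns groups in order of first appearance, whereas aggregate() sorts them; keyby = source may be closer to the base output if sorted groups matter.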