How can I sum multiple columns by group please?
For the following data I can get the expected result using base r split-apply functions:
library(data.table)
set.seed(1234)
df1 = data.table(grp = sample(0:2, 10, TRUE),
a=sample(0:1, 10, TRUE), b=sample(0:1, 10, TRUE),
c=sample(0:1, 10, TRUE), d=sample(0:1, 10, TRUE))
df1
# grp a b c d
# 1: 1 1 1 1 0
# 2: 1 1 0 0 0
# 3: 0 0 1 1 0
# 4: 2 1 1 1 0
# 5: 0 1 1 0 1
# 6: 0 1 0 1 1
# 7: 1 1 1 0 1
# 8: 1 1 0 1 1
# 9: 2 1 0 1 1
# 10: 1 1 0 1 0
rbindlist(lapply(split(df1, df1$grp),
function(x) as.data.table(t(colSums(x[,-1])))), idcol="grp")
# grp a b c d
# 1: 0 2 2 2 2
# 2: 1 5 2 3 2
# 3: 2 2 1 2 1
How to do this with data.table
syntax?
These attempts do not give the expected outcome:
df1[, .(s = lapply(.SD, sum)), .SDcols=a:d, by=grp]
df1[, .(s = colSums(.SD)), .SDcols=a:d, by=grp]