I am fairly new to data.table and still struggle with things that are easy to do with data frames. I have succeeded in summing several columns, but I wonder if there is a less verbose way of going about it.

Data:

library(data.table)
set.seed(85719)
DT <- data.table(
  t1 = rbinom(100,1,0.7),
  t2 = rbinom(100,1,0.1),
  a1 = rbinom(100,1,0.99),
  a2 = rbinom(100,1,0.31),
  l = letters,
  L = LETTERS
)

This is what I did (I have to admit I don't know what `.()` does, and it's hard to google just symbols):

DT[,.(
  "sumt1"=sum(t1),
  "sumt2"=sum(t2),
  "suma1"=sum(a1),
  "suma2"=sum(a2)
)]

and get what I want:

  sumt1 sumt2 suma1 suma2
1:    70    14    98    32

I am just wondering if there is a simpler way to obtain the sums of several columns.

Zentaur
  • 1
    `colSums(DT[,t1:a2])` BTW `.()` is shorthand for `list()` inside a `data.table`. – jblood94 Jun 09 '23 at 14:28
  • 1
    With a data frame I'd do `cols = c("t1", "t2", "a1", "a2")` and then `colSums(DF[cols])`. This translates to `colSums(DT[, ..cols])` in data.table. – Gregor Thomas Jun 09 '23 at 14:28
  • 1
    You could also do `DT[, lapply(.SD, sum), .SDcols = cols]` if you were using a more complicated function that doesn't have a highly optimized `colFun()` variant. – Gregor Thomas Jun 09 '23 at 14:31
  • 1
    I'd suggest reading the excellent [Introduction to data.table vignette](https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html) (or re-reading it as a refresher). This question is an almost verbatim example with `mean` instead of `sum` in the section **e) Multiple columns in `j` - `.SD`** (though they demonstrate a grouped example - you can always omit the `by` to do something to the whole data table instead of by group.) – Gregor Thomas Jun 09 '23 at 14:33
  • Thanks for all the suggestions. The vignette especially was really useful. Several of you recommended `colSums()`, which I had tried at some point. However, the result is not a data.table, and when I coerced it to a data.table with `data.table(colSums)` I ended up with one column of four observations instead of four columns with one observation each, which is what I want. – Zentaur Jun 09 '23 at 15:11
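
To tie the comments together, here is a minimal sketch of the two approaches suggested above. It uses a small made-up table rather than the question's data, so the values are assumptions chosen only to make the sums easy to check by eye. The `as.list()` step is one possible way to get a one-row result out of `colSums()`; it was not proposed in the comments.

```r
library(data.table)

# Small illustrative table (hypothetical values, not the question's data).
DT <- data.table(
  t1 = c(1, 0, 1, 1),
  t2 = c(0, 0, 1, 0),
  a1 = c(1, 1, 1, 1),
  a2 = c(0, 1, 0, 0),
  l  = c("a", "b", "c", "d")
)

cols <- c("t1", "t2", "a1", "a2")

# Option 1: .SD holds the columns named in .SDcols; lapply() sums each one.
# The result is already a one-row data.table with one column per input column.
res_sd <- DT[, lapply(.SD, sum), .SDcols = cols]

# Option 2: colSums() returns a named numeric vector. Converting it with
# as.list() first gives one list element per column, so as.data.table()
# builds one row with four columns instead of one column with four rows.
res_cs <- as.data.table(as.list(colSums(DT[, ..cols])))

# Optional: add the "sum" prefix used in the question (modifies res_sd by reference).
setnames(res_sd, paste0("sum", cols))

print(res_sd)
print(res_cs)
```

Option 1 generalizes to any function in place of `sum`, while Option 2 relies on `colSums()` specifically.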

0 Answers