Equivalent of SQL's SUM & GROUP BY on R data.table/data.frame?

Asked Feb 16 '17 at 21:15

Active Feb 16 '17 at 21:21


Suppose I have the following data.frame/data.table object:

foo <- data.frame(sku = c('123','234','567'), shipped_units_wk1 = c(20,10,10), shipped_units_wk2 = c(25,25,50), category = c("AA", "AA", "BB"))

sku     shipped_units_wk1    shipped_units_wk2   category
123     20                   25                  AA
234     10                   25                  AA
567     10                   50                  BB

I need it in the following format:

category    shipped_units_wk1    shipped_units_wk2
AA          30                   50
BB          10                   50

I can do this with the sqldf package, but is there a faster way using a different package? My actual data.frame has tens of thousands of rows and 20 columns.
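For reference, the transformation described above can be sketched in base R with `aggregate` (from the stats package, attached by default). This is a minimal illustration on the sample `foo` from the question, not a claim about which approach is fastest on larger data; the name `res` is only for illustration.

```r
# Sample data from the question
foo <- data.frame(
  sku = c("123", "234", "567"),
  shipped_units_wk1 = c(20, 10, 10),
  shipped_units_wk2 = c(25, 25, 50),
  category = c("AA", "AA", "BB")
)

# Sum both shipped_units columns within each category, roughly:
#   SELECT category, SUM(shipped_units_wk1), SUM(shipped_units_wk2)
#   FROM foo GROUP BY category
res <- aggregate(cbind(shipped_units_wk1, shipped_units_wk2) ~ category,
                 data = foo, FUN = sum)
print(res)
```

With more columns to sum, the `cbind(...)` on the left-hand side of the formula would list each of them.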

edited Feb 16 '17 at 21:20

Gregor Thomas


asked Feb 16 '17 at 21:15

Ray


https://github.com/Rdatatable/data.table/wiki/Benchmarks-%3A-Grouping The relevant one for your case looks like `lapply(.SD, sum) ...` – Frank Feb 16 '17 at 21:17

@Gregor I'm switching the dupe to the one for multiple vars. – Frank Feb 16 '17 at 21:21
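A minimal sketch of the `lapply(.SD, sum)` idiom mentioned in the comment above, applied to the sample data from the question. The comment elides the rest of the call, so the `by = category` and `.SDcols` arguments here are an assumption about how it would be completed for this case, and `res` is an illustrative name. It assumes the data.table package is installed.

```r
library(data.table)

# Sample data from the question, built as a data.table
foo <- data.table(
  sku = c("123", "234", "567"),
  shipped_units_wk1 = c(20, 10, 10),
  shipped_units_wk2 = c(25, 25, 50),
  category = c("AA", "AA", "BB")
)

# .SD is the per-group subset restricted to the columns in .SDcols;
# lapply(.SD, sum) sums each of those columns within each category.
# (by and .SDcols are assumed here, not taken from the comment.)
res <- foo[, lapply(.SD, sum), by = category,
           .SDcols = c("shipped_units_wk1", "shipped_units_wk2")]
print(res)
```

Restricting `.SDcols` keeps the non-numeric `sku` column out of the sum; with 20 columns, a character vector of the numeric column names could be passed instead of typing each one.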
