I have a data set with sales numbers from individual items ordered on various dates. All items in a particular order share the same ID. I want to calculate order totals (i.e. sum of sales for all items in a particular order), while preserving the date associated with each order (we can assume all items part of an order share the same date). How do I sum sales with respect to ID, while preserving the date?
This questions is different from others I've seen, because I want to preserve and collapse the Date column while summing with respect to a different column, Sales.
Columns before: Date
, ID
, Sales
Columns after: Date
, ID
, Order.Total
The following code returns an error because dates obviously cannot be summed :
df[, lapply(.SD, sum), by = "ID"]
The following code removes the Date field altogether :
df[, lapply(.SD, sum), by = "ID", .SDcols = !"Date"]
For example, if my data set before is :
DATE ID SALES
1/2 01 1
1/2 01 2
1/2 02 3
1/3 03 6
1/4 04 5
1/4 04 4
My data set after should be :
DATE ID ORDER.TOTAL
1/2 01 3
1/2 02 3
1/3 03 6
1/4 04 9