I am trying to do some aggregation using data.table
to get the mean and first value of some columns. For example:
dt <- data.table(mtcars)
dt[, .(disp = mean(disp, na.rm=T),
hp = mean(hp, na.rm=T)),
by=cyl]
out:
cyl disp hp
1: 6 183.3143 122.28571
2: 4 105.1364 82.63636
3: 8 353.1000 209.21429
desired:
cyl disp hp wt
1: 6 183.3143 122.28571 2.62
2: 4 105.1364 82.63636 2.32
3: 8 353.1000 209.21429 3.44
To extract the first row by itself, this can be done:
dt[, .SD[1], by=cyl][,.(cyl,wt)]
out:
cyl wt
1: 6 2.62
2: 4 2.32
3: 8 3.44
But how do I do it with the other aggregate functions?
In dplyr
, I will simply do:
mtcars %>% group_by(cyl) %>%
summarise(disp = mean(disp, na.rm=T),
hp = mean(disp, na.rm=T),
wt = first(wt))
out:
cyl disp hp wt
1 4 105.1364 105.1364 2.32
2 6 183.3143 183.3143 2.62
3 8 353.1000 353.1000 3.44