How can I aggregate data with sum, mean and count for each column respectively?

Question

I have a dataset called "order_product" like this:

order_id   product  order_sequence  reorder
    1        egg          1            1
    1        meat         2            0
    1        fruit        3            1
    1        meat         4            1
    2        egg          1            1
    2        egg          2            1
    2        fruit        3            0
    3        egg          1            0
    3        fruit        2            1
    3        fruit        3            1

and I am going to aggregate the data into a new data frame called "product" which is grouped by product. The variables of new aggregated dataset shows the total frequency, the reorder rate and mean sequence of each product. The calculation of each variable are as follows:

frequency: product count
reorder_rate: sum of reorder/frequency 
mean_sequence: sum or order_sequence/frequency

So the result should be like this:

product  frequency reorder_rate  mean_sequence
  egg        4          3/4           5/4
  meat       2          1/2            3
 fruit       4          3/4           11/4

Can anybody help me with this in R? I tried the melt() function in package data.table, but I don't know how to code it.

Ronak Shah · Accepted Answer · 2019-08-10T10:41:40.207

Such calculations are easy using dplyr

library(dplyr)

df %>%
  group_by(product) %>%
  summarise(frequency = n(), 
            reorder_rate = sum(reorder)/frequency, 
            mean_sequence = sum(order_sequence)/frequency)

# A tibble: 3 x 4
#  product frequency reorder_rate mean_sequence
#  <fct>       <int>        <dbl>         <dbl>
#1 egg             4         0.75          1.25
#2 fruit           4         0.75          2.75
#3 meat            2         0.5           3

However, you can also use data.table

library(data.table)

setDT(df)[, .(frequency = .N, reorder_rate = sum(reorder)/.N, 
              mean_sequence = sum(order_sequence)/.N), by = product]

How can I aggregate data with sum, mean and count for each column respectively?

1 Answers1