
I am trying to calculate the average (mean) cost for each customerID. For the data below:

customerID <- c(1,1,1,1,2,2,2,2,3,3)

dates <- c(20130401, 20130403,  20130504,   20130508,   20130511,
       20130716,    20130719,   20130723,   20130729,   20130907)
cost <- c(12,  41,  89, 45.5,   32.89,  74, 76, 12, 15.78,  10)

data <- data.frame(customerID, dates,cost)

data$dates <- as.Date(as.character(data$dates), "%Y%m%d") 

# data2 <- aggregate(cbind(average_cost=cost) + customerID, data, mean) 

The data looks like this:

customerID  dates   cost
1   20130401    12
1   20130403    41
1   20130504    89
1   20130508    45.5
2   20130511    32.89
2   20130716    74
2   20130719    76
2   20130723    12
3   20130729    15.78
3   20130907    10

How can I get an output similar to this? I can get the average for the whole data set, but not for each customer ID. Thanks!

customerID  average_cost
1           46.875
2           48.7225
3           12.89
sharp
  • @david arenburg. Yes, it is different. Previously, it was to find the total sum for each customer along with the dates. This question only involves customer and average for each customer. Dates do not matter here much as they did for the previous questions. The data set values appear to be similar but I am trying different scenarios. – sharp Feb 23 '15 at 14:59
  • This question is a specific case of the dupe. – David Arenburg Feb 23 '15 at 15:00
  • @DavidArenburg. Oh I see. Didn't see the link up on top. thanks. – sharp Feb 23 '15 at 15:04

1 Answer


dplyr solution

library(dplyr)
# `data` is the data frame built in the question
data %>%
  group_by(customerID) %>%
  summarise(average_cost = mean(cost))

  customerID average_cost
1          1      46.8750
2          2      48.7225
3          3      12.8900

data.table solution

library(data.table)
dt <- as.data.table(data)  # `data` is the data frame built in the question
dt[, .(average_cost = mean(cost)), by=customerID]

Or, if you just want base R:

aggregate(cost ~ customerID, data = data, FUN = mean)
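
Note that aggregate() names the result column after the original variable (cost), not average_cost as in the desired output. A minimal base R sketch of adding the rename, assuming the data frame is called data as in the question (the dates column is left out here because it does not affect the mean; the name result is only illustrative):

```r
# Rebuild the question's data (dates omitted: they play no role in the mean)
customerID <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3)
cost <- c(12, 41, 89, 45.5, 32.89, 74, 76, 12, 15.78, 10)
data <- data.frame(customerID, cost)

# Mean cost per customer; the value column is still called "cost" here
result <- aggregate(cost ~ customerID, data = data, FUN = mean)

# Rename the columns to match the requested output
names(result) <- c("customerID", "average_cost")

print(result)
```

The formula reads as "cost grouped by customerID", which is likely why the commented-out attempt in the question (with + instead of ~) did not work.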
cdeterman