
For each row, I want the average Score for that Name over all earlier dates, not counting the current row.

Here is an example dataframe

Date    Name  Score

1/1/16  Bill   5
1/1/16  Hank   5 
1/1/16  Aaron  10
1/2/16  Hank   15
1/2/16  Bill   10
1/3/16  Bill   20
1/3/16  Aaron  20
1/4/16  Aaron  20

How can I add this column?

Date    Name  Score Average_to_date

1/1/16  Bill   5     NA
1/1/16  Hank   5     NA
1/1/16  Aaron  10    NA
1/2/16  Hank   15    5
1/2/16  Bill   10    5
1/3/16  Bill   20    7.5
1/3/16  Aaron  20    10
1/4/16  Aaron  20    15
zx8754
user6452857
  • Use any of the methods from this question: [How to average column values by group](http://stackoverflow.com/q/11562656/903061), but wherever you see the function `mean` use `cummean` instead. It's surprising that you consider the average of a single number to be undefined - you might have to post-process to set the first value to `NA` if that's really necessary. – Gregor Thomas Jan 27 '17 at 20:08
  • I do not consider the average of a simple number undefined. It is the average of 0 numbers. Thank you for the cummean comment. – user6452857 Jan 27 '17 at 20:33
  • You should describe what you wanted the column to be then. Just saying "How can I add this column?" could have completely validly gotten the answer `df[["Average_to_date"]] <- c(NA, NA, NA, 5, 5, 7.5, 10, 15)`. – Barker Jan 27 '17 at 21:11

1 Answer


If you don't mind using dplyr:

library(dplyr)

your_data %>%
  arrange(Date) %>%
  group_by(Name) %>%
  mutate(Average_to_date = lag(cummean(Score)))

You can leave out the arrange if you are sure your data frame is already sorted by Date. Otherwise, please make sure your Date column is of Date class before arranging - see ?as.Date for details.
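To check this against the example in the question, here is a minimal self-contained sketch. The data frame name `scores` and the `%m/%d/%y` date format are assumptions made for illustration; adjust them to match your actual data.

```r
library(dplyr)

# Example data copied from the question; `scores` is a hypothetical name
scores <- data.frame(
  Date  = as.Date(c("1/1/16", "1/1/16", "1/1/16", "1/2/16",
                    "1/2/16", "1/3/16", "1/3/16", "1/4/16"),
                  format = "%m/%d/%y"),  # assumes month/day/2-digit-year
  Name  = c("Bill", "Hank", "Aaron", "Hank", "Bill", "Bill", "Aaron", "Aaron"),
  Score = c(5, 5, 10, 15, 10, 20, 20, 20)
)

result <- scores %>%
  arrange(Date) %>%       # earlier dates first, so the running mean is in order
  group_by(Name) %>%      # running mean is computed separately per Name
  mutate(Average_to_date = lag(cummean(Score))) %>%  # lag() drops the current row
  ungroup()

print(result)
```

`cummean(Score)` includes the current row in the running average, so `lag()` shifts it down by one within each group. That is what leaves `NA` on each person's first date, as in the desired output.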

Gregor Thomas