Means for previous values' scores

Question

I have kids' personal best times listed per week per stroke. For kid A, how can I calculate that kid's means for all previous weeks, excluding current week's results? For example, one kid's results might look like this:

df.trial <- data.frame(Week= c("w10", "w9", "w9", "w5", "w5", "w5", "w6", "w6", "w3"), Stroke= c("Fly","Free","Breast","Back","Free","Breast","Fly","Back","Free"),  Score = c(5.5,4.5,4.6,5.2, 4.3, 5.7, 4.7,5.5,4.8))

I'm trying to add a new column to calculate the means like this:

df.desired <- data.frame(Week= c("w10", "w9", "w9", "w5", "w5", "w5", "w6", "w6", "w3"), Stroke= c("Fly","Free","Breast","Back","Free","Breast","Fly","Back","Free"), Score = c(5.5, 4.5, 4.6, 5.2, 4.3, 5.7, 4.7, 5.5, 4.8), prev.mean = c(4.91, 5.03, 5.03, 4.80, 4.80, 4.80, 5.00, 5.00, NA))

The data I have has lots of kids so I will need to group by each student's name. Also, if there are no previous weeks then I'd like to add an NA.

I have tried a few different ways and have tried to write a function like the rollsum one described here: R previous week total

No luck so far. Any suggestions please?

It's helpful if you can include some of the code that hasn't worked, so we can help you improve upon it and see what direction you're going in. — camille, Mar 04 '19 at 15:47

Cettt · Accepted Answer · 2019-03-04T16:01:51.120

EDIT: new answer

I misunderstood your question, I think this is what you are looking for. Note that I added a column kid_id in your data.frame to show you a solution which is applicable for multiple ids.

Also note that I have to convert Weeks into numeric so I can arrange for Weeks, otherwise the arrangement won't be correct.

library(tidyverse)

df.trial <- data.frame(Kid_id = 1, Week= c("w10", "w9", "w9", "w5", "w5", "w5", "w6", "w6", "w3"), Stroke= c("Fly","Free","Breast","Back","Free","Breast","Fly","Back","Free"),  Score = c(5.5,4.5,4.6,5.2, 4.3, 5.7, 4.7,5.5,4.8))

df.trial %>% as_tibble() %>% 
  mutate(Week = as.numeric(gsub("w", "", as.character(Week))), 
         kid_id = 1) %>% 
  group_by(kid_id, Week) %>% 
  arrange(kid_id, Week) %>% 
  summarise(sum_score = sum(Score),
            n_score = n()) %>% 
  mutate(prev.mean = map_dbl(seq_along(sum_score), ~sum(sum_score[1:.x - 1]) / sum(n_score[1:.x - 1])),
         Week = paste0("w", Week)) %>% 
  ungroup() %>% 
  select(kid_id, Week, prev.mean) %>% 
  left_join(df.trial, ., by = "Week")


  Week Stroke Score kid_id prev.mean
1  w10    Fly   5.5      1  4.912500
2   w9   Free   4.5      1  5.033333
3   w9 Breast   4.6      1  5.033333
4   w5   Back   5.2      1  4.800000
5   w5   Free   4.3      1  4.800000
6   w5 Breast   5.7      1  4.800000
7   w6    Fly   4.7      1  5.000000
8   w6   Back   5.5      1  5.000000
9   w3   Free   4.8      1       NaN

Here you have NaNinstead of NA but they can easily be replaced if desired.

Thank you! That is great! I had tried dplyr but didn't come across the map_dbl. Your solution is spot on and works with multiple kids too when I change the kid_id = 1 to kid_id = kid_id. Thanks again for your help! — Leila, Mar 05 '19 at 09:51

Means for previous values' scores

1 Answers1