I'm new to R and am using it for some NFL analysis on a data frame whose relevant columns look like this:

  1. Randy Moss 12.9 2000
  2. Randy Moss 21.6 2000
  3. Randy Moss 4.0 2000
  4. Randy Moss 44.7 2000
  5. Randy Moss 25.8 2000
  6. Randy Moss 12.9 2000

It's not a list; it's a data frame in which the columns in question are the player ("fname.1"), the player's fantasy points for each game ("fp3"), and the year of the game ("year"). The data include all years from 2000 to 2019.

I want to add a column holding the mean of all of that player's fantasy results for that year. So my desired output for the example data (if Randy Moss had played only 6 games) would add a column with the mean on each row, like this:

  1. Randy Moss 12.9 2000 20.31667
  2. Randy Moss 21.6 2000 20.31667
  3. Randy Moss 4.0 2000 20.31667
  4. Randy Moss 44.7 2000 20.31667
  5. Randy Moss 25.8 2000 20.31667
  6. Randy Moss 12.9 2000 20.31667

I'm having trouble using a simple group_by() and summarize() pipeline because I need a different mean per player for each year. I wrote a for loop that creates a list with the information I need, but I'm not sure how to add that to the original data, or whether there's an easier way to accomplish this:

mean_fantasy <- list()
for (y in 2000:2019) {
  mean_fantasy[[y]] <- offense_test %>%
    filter(year == y) %>%
    group_by(fname.1) %>%
    summarize(mean_fp3 = sum(fp3) / n(), games = n(), year = sum(year) / n())
}

I'm very new to R and to this forum, so hopefully this question and its formatting make sense.

hotbacon

3 Answers

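We could use transmute with map to build one set of columns per year. Each iteration keeps every row of offense_test, so the per-year columns line up and can be bound back onto it; players with no games in a given year get NaN for that year's mean.

library(dplyr)
library(purrr)
library(stringr)
out <- map_dfc(2000:2019, ~ offense_test %>%
                     group_by(fname.1) %>%
                     # per-player mean and game count for year .x only
                     transmute(!! str_c('mean_fp3_', .x) := mean(fp3[year == .x]),
                               !! str_c('games_', .x) := sum(year == .x)) %>%
                     ungroup() %>%
                     # drop the grouping column so it is not repeated per year
                     select(-fname.1)) %>%
        bind_cols(offense_test, .)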

If we need a single mean column, we don't need a loop: include 'year' in the group_by as well, and then create the column with mutate, which keeps one row per game.

offense_test %>%
     group_by(fname.1, year) %>%
     mutate(mean_fp3 = mean(fp3), games = n())
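
As a minimal sketch of how this could be checked, assume a toy `offense_test` built from the six game scores in the question plus a few invented rows (a second season and a hypothetical "Player B"); none of the added values come from the real data.

```r
library(dplyr)

# Toy stand-in for offense_test: the six scores from the question,
# plus invented rows for a second season and a hypothetical second player
offense_test <- data.frame(
  fname.1 = c(rep("Randy Moss", 8), rep("Player B", 2)),
  fp3     = c(12.9, 21.6, 4.0, 44.7, 25.8, 12.9, 10.0, 20.0, 5.0, 7.0),
  year    = c(rep(2000, 6), 2001, 2001, 2000, 2000)
)

# Group by player and year, then attach the group mean to every row
with_means <- offense_test %>%
  group_by(fname.1, year) %>%
  mutate(mean_fp3 = mean(fp3), games = n()) %>%
  ungroup()
```

The trailing ungroup() is optional; it only keeps later steps from silently operating per group.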
akrun

Just using the ave() function should give the result you are looking for: the mean value per player per year, repeated on every row of that group.

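   # Simulated data: 4 players, 5 games each, alternating between two seasons
   fp3 <- rnorm(20, 20, 5)
   player <- rep(LETTERS[1:4], each = 5)
   year <- as.factor(rep(seq(2015, 2016, by = 1), 10))

   df <- data.frame(player, fp3, year)

   # ave() returns a vector as long as fp3 holding each row's group mean
   df$mean.player.year <- ave(df$fp3, df[, c('player', 'year')], FUN = mean)

   # And for the desired output view...
   df <- df[order(df$player, df$year), ]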

 > df
   player       fp3 year mean.player.year
1       A 20.658824 2015         14.36088
3       A 19.842985 2015         14.36088
5       A  2.580835 2015         14.36088
2       A 12.571649 2016         14.33038
4       A 16.089108 2016         14.33038
7       B 34.268847 2015         27.21018
9       B 20.151507 2015         27.21018
6       B  9.363759 2016         15.10290
8       B 19.686929 2016         15.10290
10      B 16.257998 2016         15.10290
11      C 25.823640 2015         21.57919
13      C 17.753304 2015         21.57919
15      C 21.160641 2015         21.57919
12      C 20.878661 2016         23.27219
14      C 25.665711 2016         23.27219
17      D 22.621288 2015         22.81370
19      D 23.006116 2015         22.81370
16      D 25.508619 2016         19.37231
18      D 13.923885 2016         19.37231
20      D 18.684435 2016         19.37231
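
Applied to the question's column names, the same call might look like the sketch below. The toy `offense_test` and its player names and values are invented for illustration; only the column names come from the question.

```r
# Toy stand-in for offense_test (all values are made up for illustration)
offense_test <- data.frame(
  fname.1 = c("Player A", "Player A", "Player A",
              "Player B", "Player B", "Player B"),
  fp3     = c(10, 20, 30, 4, 8, 9),
  year    = c(2000, 2000, 2001, 2000, 2000, 2001)
)

# One mean per player-year combination, repeated on every matching row
offense_test$mean_fp3 <- ave(offense_test$fp3,
                             offense_test$fname.1, offense_test$year,
                             FUN = mean)
```

Because ave() returns a vector in the original row order, it can be assigned straight into a new column with no join or re-sorting.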
Roasty247

Thanks for the answers, guys. I went with Roasty's since it was simpler, and I can verify that it worked.

hotbacon