I have a dataset of NBA players and their performance statistics such as games played, points per season, etc. I have multiple seasons for each player and a column Season.End, which indicates the year in which the data is from. The column G indicates the games played for that particular season.
Season.End Player Pos Age Tm G GS MP PER OWS DWS WS
<int> <chr> <fct> <int> <fct> <int> <int> <int> <dbl> <chr> <dbl> <chr>
1 2001 A.C. ~ PF 37 MIA 82 1 1411 11.2 1.1 2.1 3.2
2 2002 A.J. ~ PG 23 CHI 45 6 607 11.8 0 0 0.1
3 2001 A.J. ~ PG 22 CHI 33 8 630 10.3 0.4 -0.1 0.3
4 2003 A.J. ~ PG 24 GSW 2 0 9 -9 -0.1 0 -0.1
5 2017 A.J. ~ C 24 DAL 22 0 163 8.4 -0.2 0.2 0
6 2013 A.J. ~ PG 26 WAS 57 22 1278 12.4 1 1.2 2.2
7 2010 A.J. ~ PG 23 IND 56 2 865 14 0.4 0.8 1.2
8 2011 A.J. ~ PG 24 IND 50 0 795 10.7 -0.4 0.7 0.3
9 2012 A.J. ~ PG 25 IND 44 1 568 11.5 0.2 0.5 0.7
10 2014 A.J. ~ SG 27 MIN 28 0 99 9.7 -0.1 0 0
I would like to add a column that sums up the number of games that a player has played in previous seasons, so before the season. Ideally, it would look like something like this:
Season.End Player Pos Age Tm G total_games
2001 A.J. PG 22 CHI 33 0
2002 A.J. PG 23 CHI 45 33
2003 A.J. PG 24 GSW 2 78
2014 B.K. C 27 HOU 50 0
2015 B.K. C 28 HOU 74 50
2016 B.K. C 29 HOU 66 124
I tried multiple variations of using dplyr function such as group_by, summarise, etc. and looked into transforming my Season.End column into a date format, but I never managed to get the desired results.
Does anybody have a solution for my problem?