0

I have a data frame with football (soccer) stats with a gameID (which game it is), a leagueID (which of the Big Five leagues the game "belongs" to), goals scored and fouls commited. The problem is that the data frame contains each team, so the GameID is doubled. In order to get an outside view of the game result, I like to sum up the results to have a final result of one GameID.

I have this data frame:

GameID leagueID goals fouls
81 1 1 12
81 1 0 12
-------- -------- ------- ------
83 1 2 7
83 1 2 13
-------- -------- ------- ------
88 1 0 12
88 1 2 9

I would like to have this result:

GameID leagueID goals fouls
81 1 1 24
83 1 4 20
88 1 2 21

I tried to create new columns with "rowSums". My problem is that I need the sum for each GameID and no the sum of the whole rows.

TylerH
  • 20,799
  • 66
  • 75
  • 101

2 Answers2

1

You could use group_byand summarizeto get a summarised table.

library(dplyr)

data %>%  #the name of your dataframe
  group_by(GameID) %>%  
  summarise(leagueID  = mean(leagueID),
            goals = sum(goals),
            fouls = sum(fouls)) 
Liv
  • 36
  • 4
  • As evident from M aurélio's answer, you might want to add the leagueID to the grouping variable – Liv Dec 29 '22 at 14:13
0

You could easily do this with dplyr grouping by GameID and LeagueID

library(dplyr)

games <-
tribble(
     ~GameID,  ~leagueID,    ~goals,   ~fouls,
        81,        1,       1,     12,
        81,        1,       0,     12,
        83,        1,       2,      7,
        83,        1,       2,     13,
        88,        1,       0,     12,
        88,        1,       2,      9
  )

games |> 
  group_by(GameID,leagueID) |> 
  summarise(goals_sum = sum(goals), fouls_sum = sum(fouls))
M Aurélio
  • 830
  • 5
  • 13