How to calculate means for multiple groups of rows in one dataframe

Question

I am working with a dataset that contains data for 30 NBA teams and a statistic that has been calculated for each game played in the 2014-2015 season. Here is a small sample of my data:

data <- structure(list(Team = c("ATL", "ATL", "ATL", "ATL", "ATL", "BOS", 
  "BOS", "BOS", "BOS", "BOS"), Date = structure(c(16372L, 16375L, 
    16379L, 16381L, 16382L, 16372L, 16375L, 16377L, 16379L, 16381L
  ), class = c("IDate", "Date")), Stat = c(2.77833333, 2.225, 2.68166667, 
    -1.56833333, 2.09333333, 4.7, 4.12166667, 1.46833333, -0.155, 
    0.405)), row.names = c(NA, -10L), class = "data.frame")

data
#>    Team       Date      Stat
#> 1   ATL 2014-10-29  2.778333
#> 2   ATL 2014-11-01  2.225000
#> 3   ATL 2014-11-05  2.681667
#> 4   ATL 2014-11-07 -1.568333
#> 5   ATL 2014-11-08  2.093333
#> 6   BOS 2014-10-29  4.700000
#> 7   BOS 2014-11-01  4.121667
#> 8   BOS 2014-11-03  1.468333
#> 9   BOS 2014-11-05 -0.155000
#> 10  BOS 2014-11-07  0.405000

Created on 2021-06-03 by the reprex package (v2.0.0)

What I want to do is calculate the mean of the statistic for each team, so that each team has only one row and the Stat column holds that team's mean. That way I can plot one data point per team. I was thinking about using dplyr's summarise, but I am not sure if that's the easiest way to do it. Any help is appreciated!

Does this answer your question? [Aggregate / summarize multiple variables per group (e.g. sum, mean)](https://stackoverflow.com/questions/9723208/aggregate-summarize-multiple-variables-per-group-e-g-sum-mean) — user438383, Jun 03 '21 at 10:00

score 2 · Accepted Answer · answered Jun 03 '21 at 06:38

Here is one potential solution using the tidyverse:

library(tidyverse)

dat1 <- read.table(text = "Team  Date          Stat
ATL   2014-10-29    2.77833333
ATL   2014-11-01    2.22500000
ATL   2014-11-05    2.68166667
ATL   2014-11-07    -1.56833333
ATL   2014-11-08    2.09333333
BOS   2014-10-29    4.70000000
BOS   2014-11-01    4.12166667
BOS   2014-11-03    1.46833333
BOS   2014-11-05    -0.15500000
BOS   2014-11-07    0.40500000",
                   header = TRUE)
dat2 <- dat1 %>% 
  group_by(Team) %>% 
  summarise(mean = mean(Stat))

dat2
# A tibble: 2 x 2
#  Team   mean
#  <chr> <dbl>
#1 ATL    1.64
#2 BOS    2.11
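
If the full dataset has missing values, or you also want to know how many games went into each mean, the same pipeline can be extended. This is only a sketch: the column names `mean_stat` and `n_games` are made up for illustration, and `na.rm = TRUE` assumes that games with a missing Stat should simply be skipped.

```r
library(dplyr)

# Same sample values as in the question
dat1 <- data.frame(
  Team = rep(c("ATL", "BOS"), each = 5),
  Stat = c(2.77833333, 2.225, 2.68166667, -1.56833333, 2.09333333,
           4.7, 4.12166667, 1.46833333, -0.155, 0.405)
)

team_means <- dat1 %>%
  group_by(Team) %>%
  summarise(
    mean_stat = mean(Stat, na.rm = TRUE), # assumption: ignore missing games
    n_games   = n(),                      # number of rows (games) per team
    .groups   = "drop"                    # return an ungrouped tibble
  )

team_means
```

The result has one row per team, so it can be passed straight to a plotting function.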

score 2 · Answer 2 · answered Jun 03 '21 at 06:41

Base R solution

aggregate(data$Stat, by = list(data$Team),
  FUN = mean)
#>   Group.1     x
#> 1     ATL 1.642
#> 2     BOS 2.108

Created on 2021-06-03 by the reprex package (v2.0.0)
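
The default column names `Group.1` and `x` are not very descriptive. One way around that, sketched here on a cut-down copy of the question's data (Date column omitted), is to pass named lists to `aggregate()` so the names carry over to the result:

```r
# Cut-down copy of the question's sample data
data <- data.frame(
  Team = rep(c("ATL", "BOS"), each = 5),
  Stat = c(2.77833333, 2.225, 2.68166667, -1.56833333, 2.09333333,
           4.7, 4.12166667, 1.46833333, -0.155, 0.405)
)

# Naming the list elements gives the output columns readable names
team_means <- aggregate(
  list(Stat = data$Stat),
  by  = list(Team = data$Team),
  FUN = mean
)

team_means
```

The result is an ordinary data frame with columns `Team` and `Stat`, one row per team.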

Sinh Nguyen

score 1 · Answer 3 · answered Jun 03 '21 at 06:49

base R

tapply(dat1$Stat, dat1$Team, mean)

aggregate(Stat ~ Team, dat1, mean ) # already provided by Sinh Nguyen

Output:

> tapply(dat1$Stat, dat1$Team, mean)
  ATL   BOS 
1.642 2.108 
> aggregate(Stat ~ Team, dat1, mean )
  Team  Stat
1  ATL 1.642
2  BOS 2.108
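
Note that `tapply()` returns a named vector rather than a data frame. If you need one row per team for plotting, it can be converted as in the sketch below; `team_means` is just an illustrative name, and `dat1` is rebuilt here without the Date column so the snippet runs on its own.

```r
# Same sample values as dat1 above
dat1 <- data.frame(
  Team = rep(c("ATL", "BOS"), each = 5),
  Stat = c(2.77833333, 2.225, 2.68166667, -1.56833333, 2.09333333,
           4.7, 4.12166667, 1.46833333, -0.155, 0.405)
)

# Named vector of means, one element per team
means <- tapply(dat1$Stat, dat1$Team, mean)

# Turn the names and values into a two-column data frame
team_means <- data.frame(
  Team = names(means),
  Stat = as.vector(means)
)

team_means
```

From there, something like `barplot(team_means$Stat, names.arg = team_means$Team)` would give one bar per team in base graphics.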
