Here is a problem that I would like to solve using the tidyverse. It's difficult to explain but the example lays it out.
I have a dataset with several names. Each name is represented by 1 or more ID's (ranges from 1-5 in the real data). Each ID has 5 values, ordered by the draw number.
library(tidyverse)
set.seed(190)
dplyr::tibble(nms = c(rep("A", 5), rep("B",10), rep("C", 10)),
draw = c(rep(1:5, times = 5)),
id = c(rep(c("A1", "B1", "B2", "C1", "C2"), each = 5)),
val = c(sample(1:4, replace = T, size = 25))) %>% print(n=25)
#> # A tibble: 25 × 4
#> nms draw id val
#> <chr> <int> <chr> <int>
#> 1 A 1 A1 3
#> 2 A 2 A1 4
#> 3 A 3 A1 2
#> 4 A 4 A1 4
#> 5 A 5 A1 1
#> 6 B 1 B1 2
#> 7 B 2 B1 1
#> 8 B 3 B1 3
#> 9 B 4 B1 3
#> 10 B 5 B1 2
#> 11 B 1 B2 1
#> 12 B 2 B2 2
#> 13 B 3 B2 2
#> 14 B 4 B2 1
#> 15 B 5 B2 2
#> 16 C 1 C1 2
#> 17 C 2 C1 1
#> 18 C 3 C1 1
#> 19 C 4 C1 3
#> 20 C 5 C1 4
#> 21 C 1 C2 1
#> 22 C 2 C2 4
#> 23 C 3 C2 4
#> 24 C 4 C2 2
#> 25 C 5 C2 4
Created on 2023-05-31 with reprex v2.0.2
I would like to get the average of each draw for each name nms
. This means getting the average val
of each draw for every individual id
. Some names like A
only have one id, so the average val
, or mean_val
, will be the same, but since B
and C
have multiple id's, the mean_val
will be averaged for each draw. I would like the results to look like this
# A tibble: 15 × 3
# nms draw mean_val
# <chr> <int> <dbl>
# 1 A 1 3
# 2 A 2 4
# 3 A 3 2
# 4 A 4 4
# 5 A 5 1
# 6 B 1 1.5
# 7 B 2 1.5
# 8 B 3 2.5
# 9 B 4 2
#10 B 5 2
#11 C 1 1.5
#12 C 2 2.5
#13 C 3 2.5
#14 C 4 2.5
#15 C 5 4