I have long-format data in which students are nested within classrooms. For each student, I would like to calculate various class-level statistics about the classroom they study in, but exclude the student's own data from the calculation.
A simple example would be as below:
df <- data.frame(
  class_id = c(rep("a", 6), rep("b", 6)),
  student_id = c(rep(1, 3), rep(2, 2), rep(3, 1), rep(4, 2), rep(5, 3), rep(6, 1)),
  value = rnorm(12)
)
As shown above, I have six students in two classrooms, each of whom has one or more observations of value. It's easy to get the student-level average with:
library(dplyr)

df %>%
  group_by(class_id, student_id) %>%
  summarize(value = mean(value))
or to add a classroom-level average with:
df %>%
  group_by(class_id) %>%
  mutate(class_avg = mean(value))
but I can't figure out how to tell dplyr to "leave out" a given group (here, the student) in the higher-level (classroom) calculation. This is similar to the question asked here, but that one calculates the mean of all groups except for the given group. I'm not sure how to modify this with dplyr to get what I want.
Thanks for your help.
Edit: At @akrun's request, the expected output is below (produced with a slightly modified version of @jared_mamrot's answer). As you can see, the class_mean_othstudents variable is the mean of the student-level means in each class, excluding the given student. Jared's solution works, but it is a very manual approach and only applies to means. I am wondering whether there is a more general dplyr way to do this.
set.seed(123)
df <- data.frame(
  class_id = c(rep("a", 6), rep("b", 6)),
  student_id = c(rep(1, 3), rep(2, 2), rep(3, 1), rep(4, 2), rep(5, 3), rep(6, 1)),
  value = rnorm(12)
)
df %>%
  group_by(class_id, student_id) %>%
  summarize(student_mean = mean(value)) %>%
  mutate(class_mean_othstudents =
    (sum(student_mean) - student_mean) / (n() - 1)
  )
`summarise()` has grouped output by 'class_id'. You can override using the `.groups` argument.
# A tibble: 6 x 4
# Groups:   class_id [2]
  class_id student_id student_mean class_mean_othstudents
  <chr>         <dbl>        <dbl>                  <dbl>
1 a                 1       0.256                  0.907
2 a                 2       0.0999                 0.986
3 a                 3       1.72                   0.178
4 b                 4      -0.402                  0.195
5 b                 5       0.0305                -0.0211
6 b                 6       0.360                 -0.186
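
To make "more generally" concrete, here is a rough sketch of the kind of thing I have in mind. It is not a settled solution. `leave_one_out` is a hypothetical helper I made up for illustration (it is not a dplyr function), and the sketch assumes the statistic should be computed over the other students' means rather than over their raw observations. It should reproduce the class_mean_othstudents column above, and the same call also accepts other summary functions such as `median`.

```r
library(dplyr)

set.seed(123)
df <- data.frame(
  class_id = c(rep("a", 6), rep("b", 6)),
  student_id = c(rep(1, 3), rep(2, 2), rep(3, 1), rep(4, 2), rep(5, 3), rep(6, 1)),
  value = rnorm(12)
)

# Hypothetical helper (an assumption, not part of dplyr):
# for each position i, apply `fun` to x with the i-th element removed.
leave_one_out <- function(x, fun, ...) {
  vapply(seq_along(x), function(i) fun(x[-i], ...), numeric(1))
}

res <- df %>%
  group_by(class_id, student_id) %>%
  # One row per student; the result stays grouped by class_id
  summarize(student_mean = mean(value), .groups = "drop_last") %>%
  # Within each class, summarise every student's classmates
  mutate(
    class_mean_othstudents = leave_one_out(student_mean, mean),
    class_median_othstudents = leave_one_out(student_mean, median)
  ) %>%
  ungroup()

res
```

This evaluates `fun` once per student within each class, so it is presumably slower than the sum-and-subtract trick on large data, but it does not depend on the statistic being a mean. A class with a single student would hand `fun` an empty vector, which this sketch does not handle specially.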