Grouped column sums (preferably with dplyr)

Question

I am trying to compute a column sum within two grouping variables: p_id and stimuli_length.

    df <- structure(list(p_id = c("p_id3", "p_id3", "p_id3", "p_id3", "p_id3", 
"p_id3", "p_id3", "p_id3", "p_id3", "p_id4", "p_id4", "p_id4", 
"p_id4", "p_id4", "p_id4", "p_id4", "p_id4", "p_id4", "p_id4", 
"p_id4", "p_id5", "p_id5", "p_id5", "p_id5"), stimuli_length = c(4L, 
4L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 4L, 4L, 5L, 5L, 6L, 
6L, 6L, 6L, 7L, 7L, 7L, 7L), value = c(1, 1, 1, 1, 0, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1), sum = c(2, 2, 
2, 2, 3, 3, 3, 3, 1, 3, 3, 3, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 
3), expected_result = c(2, 2, 2, 2, 3, 3, 3, 3, 1, 3, 3, 3, 2, 
2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3)), row.names = c(NA, -24L), class = c("tbl_df", 
"tbl", "data.frame"))

For each p_id, and each stimuli_length, I would like the sum of the value column.

So, for p_id3 and stimuli_length == 4, the sum is 1 + 1 = 2.

My attempt does not give the correct sum:

    res <- df %>% group_by(p_id) %>% group_by(stimuli_length) %>%
      select(value) %>% rowwise() %>% mutate(sum = sum(value))

score 1 · Accepted Answer · answered Jun 20 '21 at 22:18

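We don't need rowwise(): it groups by each row, so sum() sees only one observation at a time and simply returns that row's value. Also, a second group_by() call replaces the earlier grouping by default rather than adding to it. Instead, pass both 'p_id' and 'stimuli_length' to a single group_by() and call mutate() directly to get the sum of 'value' within each group.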

    library(dplyr)
    df %>%
      group_by(p_id, stimuli_length) %>%  # one grouping over both variables
      mutate(Sum = sum(value)) %>%        # group total repeated on every row
      ungroup()
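
As a quick check of the approach, here is a minimal sketch on a small made-up data frame (the name `toy` and its values are hypothetical, only the column names come from the question). It also shows `summarise()` as an alternative, in case one row per group is wanted instead of the total repeated on every row.

```r
library(dplyr)

# Hypothetical toy data with the same column names as in the question
toy <- tibble(
  p_id           = c("a", "a", "a", "b", "b"),
  stimuli_length = c(4L, 4L, 5L, 4L, 4L),
  value          = c(1, 1, 0, 1, 0)
)

# mutate() keeps all rows and repeats each group's total on its rows
per_row <- toy %>%
  group_by(p_id, stimuli_length) %>%
  mutate(Sum = sum(value)) %>%
  ungroup()

# summarise() collapses to one row per p_id / stimuli_length combination
per_group <- toy %>%
  group_by(p_id, stimuli_length) %>%
  summarise(Sum = sum(value), .groups = "drop")
```

Here `per_row` has the same five rows as `toy`, while `per_group` has three rows, one for each combination of p_id and stimuli_length.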

akrun

The result is not correct, please check against the expected values. – eartoolbox Jun 20 '21 at 22:23
@eartoolbox Rows 9 to 12 do not match. I didn't understand your logic for that part – akrun Jun 20 '21 at 22:24
My apologies, you are quite right, there was an error in my manual result. This works. Thank you. I updated my post with the correct example. – eartoolbox Jun 20 '21 at 22:28
