
I am trying to compute a column sum with two grouping variables: p_id and stimuli_length.

    df <- structure(list(p_id = c("p_id3", "p_id3", "p_id3", "p_id3", "p_id3",
    "p_id3", "p_id3", "p_id3", "p_id3", "p_id4", "p_id4", "p_id4",
    "p_id4", "p_id4", "p_id4", "p_id4", "p_id4", "p_id4", "p_id4",
    "p_id4", "p_id5", "p_id5", "p_id5", "p_id5"), stimuli_length = c(4L,
    4L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 4L, 4L, 5L, 5L, 6L,
    6L, 6L, 6L, 7L, 7L, 7L, 7L), value = c(1, 1, 1, 1, 0, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1), sum = c(2, 2,
    2, 2, 3, 3, 3, 3, 1, 3, 3, 3, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3,
    3), expected_result = c(2, 2, 2, 2, 3, 3, 3, 3, 1, 3, 3, 3, 2,
    2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3)), row.names = c(NA, -24L), class = c("tbl_df",
    "tbl", "data.frame"))

For each p_id, and each stimuli_length, I would like the sum of the value column.

So, for p_id3 and stimuli_length == 4, the sum is 1 + 1 = 2.

My attempt does not give the correct sum:

    res <- df %>% group_by(p_id) %>% group_by(stimuli_length) %>%
      select(value) %>% rowwise() %>% mutate(sum = sum(value))
eartoolbox

1 Answer


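We don't need rowwise(): it makes every row its own group, so sum() only ever sees one observation and just returns 'value'. Also, the second group_by() replaces the grouping from the first by default rather than adding to it, so the attempt never groups by both variables. Instead, pass both 'p_id' and 'stimuli_length' to a single group_by() call and use mutate() directly to get the sum of 'value'.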

    library(dplyr)
    df %>%
      group_by(p_id, stimuli_length) %>%
      mutate(Sum = sum(value)) %>%
      ungroup()
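To see the difference on something small, here is a minimal sketch using a made-up data frame. `toy` and its values are hypothetical, not the data from the question; only the column names are borrowed. It compares rowwise() with a single two-key group_by(), and also shows summarise() in case one row per group is wanted rather than a repeated value on every row.

```r
library(dplyr)

# Hypothetical toy data (not the asker's df), same column names
toy <- data.frame(
  p_id           = c("a", "a", "a", "b", "b"),
  stimuli_length = c(4L, 4L, 5L, 4L, 4L),
  value          = c(1, 1, 0, 1, 0)
)

# rowwise(): each row is its own group, so sum(value) just returns value
by_row <- toy %>%
  rowwise() %>%
  mutate(Sum = sum(value)) %>%
  ungroup()

# One group_by() with both keys: sum within each p_id / stimuli_length pair
by_group <- toy %>%
  group_by(p_id, stimuli_length) %>%
  mutate(Sum = sum(value)) %>%
  ungroup()

# summarise() instead of mutate() collapses to one row per group
per_group <- toy %>%
  group_by(p_id, stimuli_length) %>%
  summarise(Sum = sum(value), .groups = "drop")
```

Here `by_row$Sum` is identical to `value`, while `by_group$Sum` repeats the group total on every row of the group, which is the shape of the expected_result column in the question.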
akrun