I have a data frame that looks as follows:

   WORD       CATEGORY             n
   <fct>       <fct>           <int>
 1 A            X                  4
 2 B            X                  3
 3 C            X                  6
 4 C            Y                  3
 5 D            X                  2
 6 E            X                  2
 7 F            Y                  2

I want to add a column sum that adds together the values in column n for each WORD, across all of its CATEGORY values. In rows 3 and 4, for instance, both rows have WORD C, so the value of the sum column would be 9 (6 + 3).

Here is what the full dataset would look like:

   WORD       CATEGORY             n   sum
   <fct>       <fct>           <int> <int>
 1 A            X                  4     4
 2 B            X                  3     3
 3 C            X                  6     9
 4 C            Y                  3     9
 5 D            X                  2     2
 6 E            X                  2     2
 7 F            Y                  2     2

How do I do this in the tidyverse?

Namenlos

1 Answer

If we count the number of unique CATEGORY values for each WORD and add that count to the grouping variables, we can sum n directly within each group:

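library(dplyr)

dt %>%
  group_by(WORD) %>%
  # number of distinct categories each word appears in
  mutate(uni = length(unique(CATEGORY))) %>%
  # regroup by word and that count, then total n within each group
  group_by(WORD, uni) %>%
  mutate(sum = sum(n)) %>%
  ungroup() %>%
  select(-uni)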
# A tibble: 7 x 4
  WORD  CATEGORY     n   sum
  <fct> <fct>    <int> <int>
1 A     X            4     4
2 B     X            3     3
3 C     X            6     9
4 C     Y            3     9
5 D     X            2     2
6 E     X            2     2
7 F     Y            2     2
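
Because uni is constant within each WORD, grouping by WORD alone should produce the same groups, so the helper column may not be needed. The sketch below is a shorter variant under that assumption. It rebuilds the example data as dt from the table in the question; the tibble() call and the names result and result2 are illustrative and not part of the original post.

```r
library(dplyr)

# Example data rebuilt from the question (assumed to match the asker's dt)
dt <- tibble(
  WORD     = factor(c("A", "B", "C", "C", "D", "E", "F")),
  CATEGORY = factor(c("X", "X", "X", "Y", "X", "X", "Y")),
  n        = c(4L, 3L, 6L, 3L, 2L, 2L, 2L)
)

# Group by WORD only and total n within each group
result <- dt %>%
  group_by(WORD) %>%
  mutate(sum = sum(n)) %>%
  ungroup()

# Same idea with add_count(): wt = n sums n per WORD instead of counting rows
result2 <- dt %>%
  add_count(WORD, wt = n, name = "sum")

print(result)
```

Both versions keep all seven rows and only add the sum column; add_count() is simply a one-step shorthand for the group_by/mutate/ungroup sequence.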
Abdessabour Mtk