I have a bit of a mental block here. I think this should be fairly easy, but I just can't figure it out. I have this data:
tipologia date_info n
1 Aree soggette a crolli/ribaltamenti diffusi day 113
2 Aree soggette a crolli/ribaltamenti diffusi month 59
3 Aree soggette a crolli/ribaltamenti diffusi no date 506
4 Aree soggette a crolli/ribaltamenti diffusi year 1880
5 Aree soggette a frane superficiali diffuse day 24
6 Aree soggette a frane superficiali diffuse month 7
7 Aree soggette a frane superficiali diffuse no date 148
8 Aree soggette a frane superficiali diffuse year 142
9 Aree soggette a sprofondamenti diffusi day 1
10 Aree soggette a sprofondamenti diffusi no date 1
11 Aree soggette a sprofondamenti diffusi year 2
12 Colamento lento day 25
13 Colamento lento month 12
14 Colamento lento no date 27
15 Colamento lento year 177
16 Colamento rapido day 64
17 Colamento rapido month 3
18 Colamento rapido no date 12
19 Colamento rapido year 92
20 Complesso day 107
21 Complesso month 23
22 Complesso no date 150
23 Complesso year 138
What I want to do now is sum all values in the column "n" for each group in tipologia, but without losing the information in "date_info". So I basically just want to append a column that, for the first group "Aree soggette a crolli/ribaltamenti diffusi", would have the value 113 + 59 + 506 + 1880 = 2558 in each of the first four rows.
So I tried something like:
df %>%
  count(tipologia, date_info) %>%
  group_by(tipologia) %>%
  summarise(total = sum(n))
but then I obviously lose my "date_info" column:
tipologia total
<chr> <int>
1 Aree soggette a crolli/ribaltamenti diffusi 2558
2 Aree soggette a frane superficiali diffuse 321
3 Aree soggette a sprofondamenti diffusi 4
4 Colamento lento 241
5 Colamento rapido 171
6 Complesso 418
7 Crollo/Ribaltamento 2932
8 DGPV 50
When I group by both tipologia and date_info and then sum n, I don't get the per-tipologia sum for some reason:
df %>%
  count(tipologia, date_info) %>%
  group_by(tipologia, date_info) %>%
  summarise(total = sum(n))
The result looks like this, with "total" simply repeating "n":
tipologia date_info total
<chr> <chr> <int>
1 Aree soggette a crolli/ribaltamenti diffusi day 113
2 Aree soggette a crolli/ribaltamenti diffusi month 59
3 Aree soggette a crolli/ribaltamenti diffusi no date 506
4 Aree soggette a crolli/ribaltamenti diffusi year 1880
5 Aree soggette a frane superficiali diffuse day 24
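To make this reproducible, here is a minimal sketch using a hand-typed copy of the first two groups. The names counts and group_sizes are placeholders I made up, and counts only stands in for the output of count(). If I read it correctly, every tipologia/date_info pair has exactly one row at this point, which would explain why sum(n) just gives back n:

```r
library(dplyr)

# Hypothetical stand-in for the counted data (first two groups only)
counts <- data.frame(
  tipologia = c(rep("Aree soggette a crolli/ribaltamenti diffusi", 4),
                rep("Aree soggette a frane superficiali diffuse", 4)),
  date_info = rep(c("day", "month", "no date", "year"), 2),
  n = c(113, 59, 506, 1880, 24, 7, 148, 142)
)

# Count the rows in each (tipologia, date_info) group and sum n within it
group_sizes <- counts %>%
  group_by(tipologia, date_info) %>%
  summarise(rows = dplyr::n(), total = sum(n), .groups = "drop")

print(group_sizes)
```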
I think the answer might also be somewhere in the question "How to sum a variable by group", but I just can't figure it out...
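One direction that seems to fit, as a minimal sketch only: group by tipologia alone and use mutate() instead of summarise(), since mutate() adds a column without collapsing rows. The data frame counts below is a hand-typed placeholder for the first two groups of my counted data, and with_total is just a name I picked:

```r
library(dplyr)

# Hypothetical stand-in for the counted data (first two groups only)
counts <- data.frame(
  tipologia = c(rep("Aree soggette a crolli/ribaltamenti diffusi", 4),
                rep("Aree soggette a frane superficiali diffuse", 4)),
  date_info = rep(c("day", "month", "no date", "year"), 2),
  n = c(113, 59, 506, 1880, 24, 7, 148, 142)
)

# mutate() keeps every row, so date_info survives; the group sum is
# repeated on each row of its tipologia group
with_total <- counts %>%
  group_by(tipologia) %>%
  mutate(total = sum(n)) %>%
  ungroup()

print(with_total)
```

On this sample the first four rows should carry the total for "Aree soggette a crolli/ribaltamenti diffusi" and the last four the total for "Aree soggette a frane superficiali diffuse", with date_info and n unchanged.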