I have the dataframe below and I want to add 2 new columns. The first, Cases1, should hold each row's Cases as a percentage of the total for its Age group. The second, Cases2, should hold each row's Cases as a percentage of the total across all rows:

Cm<-structure(list(`Age group` = c("00-04", "00-04", "05-14", "05-14", 
"15-24", "15-24", "25-49", "25-49", "50-64", "50-64", "65-79", 
"65-79", "80+", "80+"), Gender = c("Female", "Male", "Female", 
"Male", "Female", "Male", "Female", "Male", "Female", "Male", 
"Female", "Male", "Female", "Male"), Cases = c(64578, 70518, 
187568, 197015, 414405, 388138, 1342394, 1206168, 792180, 742744, 
400232, 414613, 282268, 198026)), row.names = c(NA, -14L), groups = structure(list(
    `Age group` = c("00-04", "05-14", "15-24", "25-49", "50-64", 
    "65-79", "80+"), .rows = structure(list(1:2, 3:4, 5:6, 7:8, 
        9:10, 11:12, 13:14), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, 7L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

# A tibble: 14 x 3
# Groups:   Age group [7]
   `Age group` Gender   Cases
   <chr>       <chr>    <dbl>
 1 00-04       Female   64578
 2 00-04       Male     70518
 3 05-14       Female  187568
 4 05-14       Male    197015
 5 15-24       Female  414405
 6 15-24       Male    388138
 7 25-49       Female 1342394
 8 25-49       Male   1206168
 9 50-64       Female  792180
10 50-64       Male    742744
11 65-79       Female  400232
12 65-79       Male    414613
13 80+         Female  282268
14 80+         Male    198026
firmo23
  • Related: [Summarizing by subgroup percentage in R](https://stackoverflow.com/questions/27134516/summarizing-by-subgroup-percentage-in-r) and [Add a New Column to a Dataframe with the Percentage of Every Value](https://stackoverflow.com/questions/56606102/add-a-new-column-to-a-dataframe-with-the-percentage-of-every-value-of-this-dataf) – Ian Campbell May 27 '21 at 20:07
  • What have you tried so far? – camille May 27 '21 at 22:02

1 Answer

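I think this should take care of it for you. Compute Cases1 while the data is grouped by `Age group`, then ungroup before computing Cases2 so that sum(Cases) covers every row: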

library(dplyr)
Cm %>%
   group_by(`Age group`) %>%                      # percentages within each age group
   mutate(Cases1 = Cases / sum(Cases) * 100) %>%
   ungroup() %>%                                  # drop grouping so sum() spans all rows
   mutate(Cases2 = Cases / sum(Cases) * 100)
## A tibble: 14 x 5
#   `Age group` Gender   Cases Cases1 Cases2
#   <chr>       <chr>    <dbl>  <dbl>  <dbl>
# 1 00-04       Female   64578   47.8  0.964
# 2 00-04       Male     70518   52.2  1.05 
# 3 05-14       Female  187568   48.8  2.80 
# 4 05-14       Male    197015   51.2  2.94 
# 5 15-24       Female  414405   51.6  6.18 
# 6 15-24       Male    388138   48.4  5.79 
# 7 25-49       Female 1342394   52.7 20.0  
# 8 25-49       Male   1206168   47.3 18.0  
# 9 50-64       Female  792180   51.6 11.8  
#10 50-64       Male    742744   48.4 11.1  
#11 65-79       Female  400232   49.1  5.97 
#12 65-79       Male    414613   50.9  6.19 
#13 80+         Female  282268   58.8  4.21 
#14 80+         Male    198026   41.2  2.96 
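
As a quick sanity check on the output above, Cases1 should add up to 100 within each age group and Cases2 should add up to 100 over the whole table. The sketch below is only an illustration: it rebuilds the question's data as a plain, ungrouped data frame, and the names `res` and `group_totals` are made up for this example.

```r
library(dplyr)

# Same values as the question's Cm, rebuilt as a plain ungrouped data frame
# (check.names = FALSE keeps the space in the "Age group" column name)
Cm <- data.frame(
  `Age group` = rep(c("00-04", "05-14", "15-24", "25-49",
                      "50-64", "65-79", "80+"), each = 2),
  Gender = rep(c("Female", "Male"), times = 7),
  Cases = c(64578, 70518, 187568, 197015, 414405, 388138, 1342394,
            1206168, 792180, 742744, 400232, 414613, 282268, 198026),
  check.names = FALSE
)

# Same pipeline as in the answer; `res` is a hypothetical name
res <- Cm %>%
  group_by(`Age group`) %>%
  mutate(Cases1 = Cases / sum(Cases) * 100) %>%
  ungroup() %>%
  mutate(Cases2 = Cases / sum(Cases) * 100)

# Cases1 should add up to 100 within every age group
group_totals <- res %>%
  group_by(`Age group`) %>%
  summarise(total = sum(Cases1), .groups = "drop")
stopifnot(all(abs(group_totals$total - 100) < 1e-8))

# Cases2 should add up to 100 across the whole table
stopifnot(abs(sum(res$Cases2) - 100) < 1e-8)
```

The comparisons use a small tolerance rather than `==` because the percentages are floating-point values.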
Ian Campbell