1

I am working on the below table in R and would like to add the last row containing the totals for each column. I used this code:

janitor::adorn_totals("row", fill = "-",na.rm = TRUE)

but get the sum of each column. However, I only want the sum of selected columns (population, cases, and deaths) but I also want to compute the total attack rate as (total cases/total population*10000) and the total CFR as (total deaths/total Cases*100). I would be grateful for your help with this as I am a new user of R.

tab <- read.table(sep = ";", dec = ".", check.names = FALSE, stringsAsFactors = FALSE, header = TRUE,
text = "
Area;  Population;  Cases;  Attack Rate (per 100000);  Deaths;  CFR (%)
A;     216767;        18;                       8.3;       0;        0
B;     181160;        17;                       9.4;       0;        0
C;     478978;       717;                      87.1;      16;      3.8
D;     446411;        13;                       2.9;       0;        0
E;     268476;         3;                       1.1;       0;        0
F;     494289;        45;                       9.1;       2;      4.4
G;     106303;         9;                       8.5;       0;        0
H;     173990;         8;                       4.6;       0;        0
")
akraf
  • 2,965
  • 20
  • 44

1 Answers1

0

Its easier to calculate the derived columns Attack Rate and CFR only after the sums have been calculated.

Then, you can use basic tidyverse functions to first generate a 1-row table with summaries only. You add this to the original data frame and calculate Attack rate and CFR last.

library(dplyr)
# Start with non-derived columns only
tab1 <- tab %>% select(Area, Population, Cases, Deaths)

# Summarise into a new data frame with one row
summ <- tab1 %>% 
  select(-Area) %>% # cannot sum() this column
  summarise_all(sum) %>% 
  mutate(Area = "Total")

# Add sums to the original data frame
tab <- bind_rows(tab1, summ)

# Calculate derived columns
tab <- tab %>%
  mutate(`Attack Rate` = Cases/Population*10000,
         `CFR (%)`     = Deaths/Cases*100)

Resulting tab:

   Area Population Cases Deaths Attack Rate  CFR (%)
1     A     216767    18      0   0.8303847 0.000000
2     B     181160    17      0   0.9383970 0.000000
3     C     478978   717     16  14.9693723 2.231520
4     D     446411    13      0   0.2912115 0.000000
5     E     268476     3      0   0.1117418 0.000000
6     F     494289    45      2   0.9103986 4.444444
7     G     106303     9      0   0.8466365 0.000000
8     H     173990     8      0   0.4597965 0.000000
9 Total    2366374   830     18   3.5074760 2.168675

However, I get other values than you for the derived columns...

akraf
  • 2,965
  • 20
  • 44
  • Thank you very much, Monica. My attack rate was computed by 100,000 and the number of cases for c is 414 instead of 717. This is why your results are slightly different from mine. However, the concept is very well understood. I tried it and it worked. – George Sie Williams Jan 04 '20 at 12:51