I have just set up my dataframe from census data and a health administrative database in long format - to get counts/denoms by subgroups.
Here is a sample of what it looks like
YR Region DI Cat Sex Age15cat n Age15Pop weight
2014 ON Mat Low F 45-59 574 913430 0.5258552
2014 ON Mat Low F 60-74 2227 657160 0.3208399
2014 ON Mat Low F 75+ 2999 300750 0.1533049
2014 ON Mat Low M 45-59 585 865105 0.5258552
2014 ON Mat Low M 60-74 2120 605290 0.3208399
2014 ON Mat Low M 75+ 3150 233935 0.1533049
YR could also be 2015-2017, DI is either Mat or Soc, Cat is either Low or High
Within each region, DI, and cat I am looking to calculate a crude and age-standardized rate with confidence intervals for females, males and total (understanding that I will probably have to reshape to get this).
YR Region DI Cat Sex Rate.t Estimate lci uci
2014 ON Mat Low M c.rate 0.0281 0.0257 0.0306
2014 ON Mat Low M s.rate 0.0231 0.0210 0.0246
2014 ON Mat Low F c.rate 0.0221 0.0201 0.0232
2014 ON Mat Low F s.rate 0.0241 0.0237 0.0246
2014 ON Mat Low T c.rate 0.0251 0.0220 0.0267
2014 ON Mat Low T s.rate 0.0255 0.0230 0.0260
Then, I would like to calculate risk ratios and risk differences (with CIs) that are crude and age-standardized within each region and DI - Low divided by High and Low minus High for females, males and total and eventually get the following output (realizing I will probably have to reshape).
YR Region DI Sex Crude_Std Measure Estimate lci uci
2014 ON Mat F crude RR 1.12 1.00 1.20
2014 ON Mat M crude RR 1.89 1.22 3.00
2014 ON Mat T crude RR 1.30 1.12 1.52
2014 ON Mat F crude RD 0.23 0.21 0.24
2014 ON Mat M crude RD -0.01 -0.05 0.03
2014 ON Mat T crude RD 0.10 0.05 0.15
2014 ON Mat F std RR 1.03 1.00 1.05
2014 ON Mat M std RR 1.50 1.42 1.60
….
I would like to find a way to do this in tidyverse without having to manually code functions (especially for calculating confidence intervals).
I've tried to play around with code from another post (see below)
df.copd2 %>%
group_by(YR,Region, DI, Cat, Sex) %>%
summarise(age_adjust = list(ageadjust.direct(count = n,
pop = Age15Pop, stdpop = weight))) %>%
mutate(age_adjust = map(age_adjust, as.data.frame.list)) %>%
unnest
This is working now. I will rerun a group_by
without sex to get total and then bind_rows
. It's still missing CIs for crude rate though - how do I get that?
I'm not sure how to get rate ratios and rate differences with CIs. I think the package dsrr
would be the way to go, but I'm not sure how to do it with tidyverse.
Any help is greatly appreciated!