I am trying to calculate a series of unadjusted odds ratios for my dataframe in R. Some of the crosstables contain zero cell sizes, and from what I've read I believe using the Haldane-Anscombe correction to add 0.5 to all cell sizes in those tables is an appropriate next step.

I can't share my actual dataset, so I created a small random sample dataset that treats "male" as the outcome and "eye_color" and "hair_color" as the predictors. The code below shows how I'm currently calculating the ORs. In this sample dataset, no one who is male has green eyes.

    # Packages used below
    library(crosstable)  # crosstable()
    library(broom)       # tidy()

    # Creating sample dataset
    male <- c(1,1,1,1,1,1,1,1,0,0,0,0,0,0,0)
    eye_color <- c("blue","blue","blue","blue","brown","brown","brown","brown","blue","blue","blue","brown","brown","green","green")
    hair_color <- c("brown","brown","brown","black","black","brown","brown","blonde","blonde","blonde","black","brown","brown","black","black")
    df <- data.frame(male, eye_color, hair_color)

    # Crosstable stratified by male
    ctable <- crosstable(df, c(eye_color, hair_color), by = male, percent_digits = 2)

    # Fitting one unadjusted logistic regression per predictor
    vars <- c("eye_color", "hair_color")
    cols <- df[vars]
    ors_list <- lapply(as.list(cols), function(x) glm(male ~ x, data = df, family = binomial(link = "logit")))

    # Creating tibble of ORs (exponentiated coefficients) with confidence intervals
    ors <- do.call(rbind, lapply(ors_list, broom::tidy, exponentiate = TRUE, conf.int = TRUE))

The only examples of the Haldane-Anscombe correction I've been able to find show people manually adding 0.5 to all cell sizes and using different methods of calculating the ORs. Is there a way to incorporate the correction into the code I'm using? Or is there a way to apply the correction with different code that generates the same kind of result?
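For reference, here is a minimal sketch of the manual approach as I understand it, not something I'm sure is the right way to do it. The helper name `ha_or` and its arguments are my own (hypothetical); it builds the 2x2 table for one predictor level against a reference level, adds 0.5 to every cell only when the table contains a zero, and computes the OR with a Wald interval on the log scale. It assumes "blue" is the reference level, matching the default alphabetical reference that `glm` would use.

```r
# Sample data from the question (repeated so this block runs on its own)
male <- c(1,1,1,1,1,1,1,1,0,0,0,0,0,0,0)
eye_color <- c("blue","blue","blue","blue","brown","brown","brown","brown",
               "blue","blue","blue","brown","brown","green","green")

# Hypothetical helper: OR of `level` vs `ref` for a binary outcome,
# with 0.5 added to all four cells when any cell is zero
ha_or <- function(outcome, predictor, level, ref, conf = 0.95) {
  keep <- predictor %in% c(level, ref)
  tab <- table(factor(predictor[keep], levels = c(level, ref)),
               factor(outcome[keep], levels = c(1, 0)))
  if (any(tab == 0)) tab <- tab + 0.5   # Haldane-Anscombe correction

  n11 <- tab[1, 1]; n10 <- tab[1, 2]    # level: outcome = 1, outcome = 0
  n01 <- tab[2, 1]; n00 <- tab[2, 2]    # ref:   outcome = 1, outcome = 0

  log_or <- log((n11 * n00) / (n10 * n01))
  se <- sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)  # SE of the log OR
  z <- qnorm(1 - (1 - conf) / 2)

  data.frame(term = level,
             estimate = exp(log_or),
             conf.low = exp(log_or - z * se),
             conf.high = exp(log_or + z * se))
}

# One row per non-reference eye color, each compared with "blue"
ha_ors <- do.call(rbind, lapply(c("brown", "green"),
                                function(l) ha_or(male, eye_color, l, "blue")))
ha_ors
```

The same call could be repeated for `hair_color` with its own reference level, but this produces Wald intervals rather than the profile-likelihood intervals that `broom::tidy(..., conf.int = TRUE)` reports for a `glm`, so the two outputs would not be directly comparable.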

Also, please let me know if Haldane-Anscombe is not appropriate for this case and there's another method I should be considering. Thank you!

Shannon