0

The script below is used to create a loop, create certain outputs and separately export them. I would like to modify the script to append the variables of interest by their stratifiers - i.e. combine health_mental, overall_health and outlook_life outputs when they're stratified by age, and another output when they're stratified by sex. I would like to then export those files out in excel format. Thanks in advance for the help!

library(tidyverse)
library(dplyr)

df <- data.frame (overall_health = c("poor", "good", "excellent", "poor", "good", "poor", "poor", "excellent"),
                  outlook_life = c("good", "excellent", "excellent", "poor", "excellent", "poor", "excellent", "poor"),
                  health_mental = c("poor", "poor", "excellent", "poor", "poor", "poor", "excellent", "good"),
                  sex = c("F", "M", "M", "F", "F", "M", "F", "M"),
                  age_group = c("50-54", "60-64", "80+", "70-74", "40-44", "45-49", "60-64", "65-69"),
                  income = c("$<40,000", "$50,000-79,000", "$80,000-110,000", "$111,000+", "$<40,000", "$<40,000", "$50,000-79,000", "$80,000-110,000"),
                  education = c("HS", "College", "Bachelors", "Masters", "HS", "College", "Bachelors", "Masters"),
                  geography= c("area1", "area2", "area1", "area2", "area2", "area1", "area2", "area1"))
geos <- unique(df$geography)

vars <- c("health_mental", "overall_health", "outlook_life")

combinations <- expand.grid(c("age_group", "sex"), vars, stringsAsFactors = F)

combinations$label <- paste(combinations$Var1, combinations$Var2, sep = "_")

output <- list()

for (geo in geos){
  for (combo in 1:nrow(combinations)){
    output_label <- paste(combinations[[combo,"label"]],geo,sep="_")
    temp <- df %>%
                filter(geography == geo) %>%
                group_by_at(combinations[combo,1:2] %>% unlist()%>%unname()) %>%
                summarise(count = n(),
                          total = nrow(.),
                          proportion = count/nrow(.) *100)
    output[[output_label]] <- temp
    
  }
}

Expected output - example

enter image description here

user438383
  • 5,716
  • 8
  • 28
  • 43
  • summarize by group: https://stackoverflow.com/q/11562656/3358272 – r2evans Feb 06 '23 at 18:07
  • That post doesn't address what I need. – R_coder_new Feb 06 '23 at 18:46
  • 1
    You want a count, total ("N"?), and proportion for each combination of our `outcome`, `response`, and `agegrp` variables, how is that not summarizing by group? Please provide expected output (filled in) given this sample data, it will likely explain why I don't see the issue. – r2evans Feb 06 '23 at 19:00
  • I already have the summarize function in my loop script. I need to modify the script to create data.frame files that append the output[output_label] when it's stratified by age_group, and another when stratified by sex. To reiterate, the output file is currently formatted as a list.. I need to convert into 2 different data.frame, when stratified by age group and another by sex – R_coder_new Feb 06 '23 at 19:22
  • 1
    My first thought then was to start with `lapply(split(output, sub("_.*", "", names(output))), bind_rows, .id = "outcome")`, which produces two frames, but because of the disparate column names, they aren't very useful. You refer to appending to an excel file, are you intending to have 12 separate "tables" in the xlsx? – r2evans Feb 06 '23 at 19:49

0 Answers0