Full disclosure- I inherited this code and tried to Frankenstein it enough to make it work. It isn't perfect.
I have a series of Excel workbooks I'm iterating through to extract financial data for a group of medical practices. The workbooks have a tab for each month. I used lapply() to iterate over the sheets to pull only the months in each quarter. One of the practices only has data from January and February of 2022 so I wouldn't expect that to show up for the 4th quarter update we just ran. However, that data is there.
library(tidyverse)
library(readxl)
library(openxlsx)
df1 <- data.frame("Medication" = seq(1:50),
"Total WAC" = seq(51:100))
df2 <- data.frame("Medication" = seq(1:50),
"Total WAC" = seq(51:100))
list_of_datasets <- list("January" = df1, "February" = df2)
write.xlsx(list_of_datasets, file = "C:/MC_report.xlsx")
current_month <- lubridate::month(as.Date(Sys.Date(), format = "%Y/%m/%d"))
current_year <- lubridate::year(as.Date(Sys.Date(), format = "%Y/%m/%d"))
Q1 <- c("January", "February", "March")
Q2 <- c("April", "May", "June")
Q3 <- c("July", "August", "September")
Q4 <- c("October", "November", "December")
quarter <- switch(current_month,
"1" = Q4, "2" = Q4, "3" = Q4,
"4" = Q1, "5" = Q1, "6" = Q1,
"7" = Q2, "8" = Q2, "9" = Q2,
"10" = Q3, "11" = Q3, "12" = Q3)
year <- ifelse(current_month %in% c(1, 2, 3), current_year - 1, current_year)
names = c("Medication", "Total WAC")
MCPath22 = "C:/MC_report.xlsx"
MClist22 = lapply(quarter, function(x){ # this function is repeated for each practice. I won't paste it over and over
dat = read_excel(MCPath22, sheet = x, skip = 1)[c(1,2)] # 1 is 'Medication' 2 is "Total WAC'
names(dat) = names
dat$Month = x
dat$Year = year
dat$Location = "Medical Center"
return(dat)
})
MC_newdata = do.call(rbind, MClist22) %>%
select( Medication, `Total WAC`, Month, Year, Location) %>%
mutate(Date.Added = Sys.time())
data = rbind(MC_newdata, DHP_newdata, Lex_newdata, Derm_newdata, Onc_newdata, oldvalues) %>%
filter(!is.na(Medication)) #includes all the practices
write_csv(data,"PAP Data.csv")
I just ran this again and all facilities save for the one with only January and February tabs are running correctly. It throws an error that 'October" not found, which is expected. I can stop that piece in R Studio and the script completes. And then Jan and Feb are in the output. Any idea why it's outputting the wrong data?