I am trying to combine 4 separate files of emergency department visits to get the total count for each day in the dataset. Each file corresponds to a year starting 7/1 0700 and ending 7/1 0659. Each row in the file corresponds to an individual patient who arrived on a given day.
I used the code below to create a new data frame that lists each date along with the total patients seen on that date. The problem I'm trying to solve is that I have two separate 7/1 entries for each year (one corresponds to the midnight-0659 count and the other corresponds to the 0700-2359 count).
What is the best approach to have those two 7/1 entries combined into a single row with the total count for that date?
This is what the counts18.21 data look like with a 7/1 example in the middle: |date|n| |---|---| | 2018-06-28|206 |2018-06-29|209 |2018-06-30|189 |2018-07-01|44 |2018-07-01|174 |2018-07-02|222 |2018-07-03|203
Thanks!
#create counts
count_fy18 <- fy18.cleaned %>% count(arrive.date)
count_fy19 <- fy19.cleaned %>% count(arrive.date)
count_fy20 <- fy20.cleaned %>% count(arrive.date)
count_fy21 <- fy21.cleaned %>% count(arrive.date)
#Join datasets
fy18_19 = fy18.cleaned %>% full_join(fy19.cleaned)
fy20_21 = fy20.cleaned %>% full_join(fy21.cleaned)
fy18_21 = fy18_19 %>% full_join(fy20_21)
count.1819 = full_join(count_fy18,count_fy19)
count.2021 = full_join(count_fy20,count_fy21)
count18.21 = full_join(count.1819,count.2021)
dput(count18.21[1:10,])
structure(list(arrive.date = structure(c(1498867200, 1498953600,
1499040000, 1499126400, 1499212800, 1499299200, 1499385600, 1499472000,
1499558400, 1499644800), tzone = "UTC", class = c("POSIXct",
"POSIXt")), n = c(173L, 197L, 190L, 165L, 225L, 223L, 214L, 218L,
200L, 245L)), row.names = c(NA, -10L), class = c("tbl_df", "tbl",
"data.frame"))