I am looking for a way to fit the following two dplyr operations into one and make my operation shorter and more elegant:
Overall, I want to create a sum across counter
if conditions x
and y
are met and then allocate this sum to the 4 coming quarters (quarter
), excluding the current quarter. There are cases where this leads to an overlap, which needs to be summed up. I cannot use dplyr's lag()
because due to the summarise function, I do not have all consequential quarters in the output. This is why I had to do a "detour" and split the dplyr operation into two. I am now looking for an elegant way to do it all in one operation and to avoid the intermediary step.
#Rep example
compid <- c(replicate(10, "A"), replicate(10, "B"))
quarter <- c(11:20, 11:20)
x <- c(0,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0)
counter <- c(0,1,2,0,1,0,0,0,0,0,1,1,0,1,0,0,0,0,0,0)
dat <- data.frame(compid, quarter, x, counter)
#First, I create the sum count
dat %>%
group_by(compid, quarter) %>%
filter(x == 1) %>%
summarise(sumcount = sum(counter)) %>%
ungroup() -> temp
#Then, I did not know how to opearte in dplyr. I want to eliminate this intermediary step.
temp1 <- temp
temp1$quarter <- temp1$quarter + 1
temp2 <- temp
temp2$quarter <- temp2$quarter + 2
temp3 <- temp
temp3$quarter <- temp3$quarter + 3
temp4 <- temp
temp4$quarter <- temp4$quarter + 4
temp <- rbind(temp1, temp2, temp3, temp4)
#Lastly, I went back to dplyr to consolidate and refine the data
temp %>%
group_by(compid, quarter) %>%
summarise(sumcount = sum(sumcount)) %>%
right_join(dat, by = c("compid", "quarter")) %>%
mutate(sumcount = ifelse(is.na(sumcount), 0, sumcount))