I have two variables that change over time. Addition is known, but as for substraction, I know only the sum that needs to be substracted, while I want to substract it from both variables proportionally.
Here is an example of a dataset
df = data.frame(id = c(rep(1,5), rep(2,3)),
ord = c(1:5, 1:3),
a = c(10, NA, 20, 0, NA, 0, 15, NA),
b = c(0, NA, 0, 15, NA, 10, 0, NA),
substract = c(NA, -5, NA, NA, -10, NA, NA, -15)) %>%
rowwise() %>%
mutate(all = sum(c(a, b, substract), na.rm = TRUE)) %>%
arrange(id, ord) %>%
group_by(id) %>%
mutate(all = cumsum(all)) %>%
ungroup()
So, I want to replace NA
in a
and b
with the value from substract
, multiplied by cumulative sum of a
and b
respectively, divided by value in all
right before the NA
. The problem is, that after each replacement, the next replacement should take into account all the previous replacements, as cumulative sum of a
and b
will change after that.
I have a solution with a while
loop that works, but is highly ineffective. The original dataset is huge, so it is not an option for me, but it might give some additional insight on what I would like to achieve.
test = df %>%
group_by(id)
while(any(is.na(test$a))){
test = test %>%
mutate(across(c("a", "b"), ~ ifelse(is.na(.x), lag(cumsum(.x)) / lag(all) * substract, .x)))
}
Could anyone suggest a more effective solution? Like, if there was any way to make mutate
function save changes after each mutation (so it does not need to be put into a while loop) or something?
EDIT: user63230 suggested using recursive functions. It does seem to be what I wanted, but I still have difficulties in their application to my case. accumulate2()
accepts only 3-arguments function and does not seem to work with lag()
(as I need not only the previous value of the current variable), so it seems to be not enough. Maybe there is a way to make it work, but I have not discovered it yet. Any help would be very appreciated.