2

I am trying to use a lag value of a previous row, which needs to be calculated from the previous row (unless its first entry).

I was trying something similar to:

test<-data.frame(account_id=c(123,123,123,123,444,444,444,444),entry=c(1,2,3,4,1,2,3,4),beginning_balance=c(100,0,0,0,200,0,0,0),
                 deposit=c(10,20,5,8,10,12,20,4),running_balance=c(0,0,0,0,0,0,0,0))

test2<-test %>%
  group_by(account_id) %>%
  mutate(running_balance = if_else(entry==1, beginning_balance+deposit,
                                   lag(running_balance)+deposit))

print(test2)

the running balance should be 110,130,135,143,210,222,242,246

Bridget
  • 33
  • 3
  • 2
    Hi Bridget, welcome to Stack Overflow. It will be much easier to help if you provide at least a sample of your data with `dput(Account_id[1:20,])`. You can [edit] your question and paste the output. Please surround the output with three backticks (```) for better formatting. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/) for more info. – Ian Campbell Jun 18 '20 at 23:01

2 Answers2

0

For each account_id you can add first beginning_balance with cumulative sum of deposit.

library(dplyr)

test %>%
  group_by(account_id) %>%
  mutate(running_balance = first(beginning_balance) + cumsum(deposit))


#  account_id entry beginning_balance deposit running_balance
#       <dbl> <dbl>             <dbl>   <dbl>           <dbl>
#1        123     1               100      10             110
#2        123     2                 0      20             130
#3        123     3                 0       5             135
#4        123     4                 0       8             143
#5        444     1               200      10             210
#6        444     2                 0      12             222
#7        444     3                 0      20             242
#8        444     4                 0       4             246

Same thing using data.table :

library(data.table)
setDT(test)[, running_balance := first(beginning_balance) + cumsum(deposit), account_id]
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • I had only tested this with 1 account id, but now that I am applying it to the full data set, it doesnt seem to be grouping by Account_id. any idea why? – Bridget Jun 23 '20 at 17:48
  • @Bridget Probably you have `plyr` loaded along with `dplyr` which is masking `mutate` function. You can restart R and load only `dplyr` or use `dplyr::mutate` explicitly. `data.table` solution should work though irrespective. – Ronak Shah Jun 23 '20 at 22:47
0

Using for-loops for each unique account_id and adding cumulative sum for each id.

for ( i in unique (test$account_id)) {

  test$running_balance [test$account_id == i] <- cumsum(test$beginning_balance[test$account_id == i]+test$deposit[test$account_id == i])

}

print (test)

     account_id entry beginning_balance deposit running_balance
1        123     1               100      10             110
2        123     2                 0      20             130
3        123     3                 0       5             135
4        123     4                 0       8             143
5        444     1               200      10             210
6        444     2                 0      12             222
7        444     3                 0      20             242
8        444     4                 0       4             246
Arun kumar mahesh
  • 2,289
  • 2
  • 14
  • 22