Sum previous instances that match the same ID

Question

I have this example dataset:

    df <- data.frame(ID = c(1, 1, 1, 2, 2, 2), A = c("2018-10-12",
          "2018-10-12", "2018-10-13", "2018-10-14", "2018-10-15", "2018-10-16"), 
          B = c(1, 5, 7, 2, 54, 202))
  ID       A      B
1  1 2018-10-12   1
2  1 2018-10-12   5
3  1 2018-10-13   7
4  2 2018-10-14   2
5  2 2018-10-15  54
6  2 2018-10-16 202

What I'm trying to do is create a column C that is the sum of B but only for dates before each respective row. For instance, the output I'm seeking is:

  ID       A      B   C
1  1 2018-10-12   1   1
2  1 2018-10-12   5   6
3  1 2018-10-13   7   13
4  2 2018-10-14   2   2
5  2 2018-10-15  54   56
6  2 2018-10-16 202   258

I generally will use subsets to do individual sumifs when I have those questions, but I'm not sure how to do this in a new column.

My end goal is to determine the dates that each ID (if applicable) crosses 50.

Thanks!

score 1 · Answer 1 · answered Nov 16 '18 at 19:15

1

We can do a group by cumulative sum to create the 'C' column

library(dplyr)
df %>% 
  group_by(ID) %>% 
  mutate(C = cumsum(B))

Or use data.table

library(data.table)
setDT(df)[, C := cumsum(B), by = ID]

or with base R

df$C <- with(df, ave(B, ID, FUN = cumsum))

answered Nov 16 '18 at 19:15

akrun

874,273
37
540
662

Thanks! This seems to work. I used the base R method, however it seems that I need to have the data sorted by investment date for it to work correctly, right? – deslaur Nov 16 '18 at 19:55
@deslaur You can order by `investmentdate` before and then apply the `cumsum` on the `order`ed data – akrun Nov 16 '18 at 19:56

Sum previous instances that match the same ID

1 Answers1