0

I have a dataset like this

dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
              Col1 = rep(c("B","S","S","B"), 4),
              Col2 = rep(c(1,2,3,4), 4),
              Col3 = rep(c(0.1,0.2,0.3,0.4), 4))

I'm trying to create a fourth column as shown below

dat1 <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
               Col1 = rep(c("B","S","S","B"), 4),
               Col2 = rep(c(1,2,3,4), 4),
               Col3 = rep(c(0.1,0.2,0.3,0.4), 4),
               Col4 = c(1, 0.8, 1.26, 4, 1, 0.8, 1.26, 4, 1, 0.8, 1.26, 4))

What I have tried till now,

d1 <- dat %>% 
  group_by(Col0) %>% 
  mutate(Col4 = if_else(Col1 == 'B', Col2,
                        if_else(Col1 == 'S' & lag(Col1 == "B"), lag(Col2)- Col3*lag(Col2), 0)))
d1

The Answer I'm getting is not what is in Col4, which is desired. The condition for getting Col4 is :

 if Col1 is B then get the value of Col2 as it is,

 if Col1 is S & Previous Value of Col1 is B then 1-(0.2*1) which is equal to 0.8
 if Col1 is S & Previous Value of Col1 is S as well then (1+0.8) -((1+0.8)*0.3) which is 1.26

Basically, it's like first performing difference and then performing cumulative sum including the difference and so on.

For now, I have taken a simple example to understand what I'm trying to achieve, the actual data-set has more than 1 million Obs. and Several Thousand Groups and what's worse is that the Combination of 'B' & 'S' alter. Like in some groups it's B,B,S,S and So on...

Any Help on this will be appreciated as I have tried several things other than if_else() and seen many conditional cumulative sum Ques as well but to no avail.

I think the same could be done easily in Excel with SUMIF() Function, but i need to do this with R

Prashant Dey
  • 23
  • 2
  • 8
  • Although https://stackoverflow.com/questions/14689424/use-a-value-from-the-previous-row-in-an-r-data-table-calculation uses a `data.table` , maybe that helps as well – CIAndrews Nov 27 '18 at 10:21
  • 1
    Well, @CIAndrews it's not that simple. Other than this I have seen Ques form https://stackoverflow.com/questions/16741683/conditional-cumulative-sum-in-r & https://stackoverflow.com/questions/49356656/multiple-conditional-cumulative-sum-in-r & https://stackoverflow.com/questions/42707796/conditional-cumulative-sum-with-dplyr but don't help solve the problem. This is done easily with excel. I tried but don't know how to apply the same in R – Prashant Dey Nov 27 '18 at 10:25

1 Answers1

0

It feels like you didn't complete the if_else:

dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
          Col1 = rep(c("B","S","S","B"), 4),
          Col2 = rep(c(1,2,3,4), 4),
          Col3 = rep(c(0.1,0.2,0.3,0.4), 4))
d1 <- dat %>% 
   group_by(Col0) %>% 
   mutate(Col4 = if_else(Col1 == 'B', Col2,
                    if_else(Col1 == 'S' & lag(Col1) == "B", 1-(0.2*1),
                            if_else(Col1 == 'S' & lag(Col1) == 'S',1.26,0))))
d1
CIAndrews
  • 1,046
  • 10
  • 19