Hi, I want to create an experience variable from a data frame that looks similar to this:

ID    Year   Experience gained that year
1      2000      1
1      2001      3
1      2002      1 

I am trying to do something similar to this:

NewDF <- DF %>% mutate(Cummulated_Exp = sum(DF[which(DF$ID == ID & DF$year < year),3]))

resulting in

ID    Year   Experience gained that year      Cummulated_Exp
1      2000      1                                0
1      2001      3                                3
1      2002      1                                4

This won't work for a variety of reasons, likely something to do with the data I pass to the sum() function, as the error I always end up with is:

Error in FUN(X[[i]], ...) : 
only defined on a data frame with all numeric variables

Thanks for the help

Laurenz
  • your logic is not clear `DF$ID == ID & DF$year < year`. can you describe the logic – akrun Jun 21 '21 at 18:38
  • I want to add all the experience that occurred for the same unit (DF$ID == ID) in the previous years (DF$year < year). So I guess the output table is slightly wrong and the logic for the output presented would be (DF$year <= year) but same thing. – Laurenz Jun 21 '21 at 18:42
  • I think you want `DF %>% arrange(ID, Year) %>% group_by(ID) %>% mutate(Cum_Exp = cumsum(\`Experience gained that year\`))` – Gregor Thomas Jun 21 '21 at 18:42
  • You need `arrange(ID, Year) %>% group_by(ID) %>% mutate(Cumulated_Exp = cumsum(\`Experience gained that year\`))` – akrun Jun 21 '21 at 18:43
  • Thank you that solves it, not quite sure what went wrong but thanks!! – Laurenz Jun 21 '21 at 18:47
  • I've closed as a duplicate of "sum by group" - you just need to replace `sum` with `cumsum`. Two big misconceptions in your attempt: you want a sum for each `ID`, so you need `group_by(ID)`. The other big miss is that with `dplyr` you almost **never** need to use the data frame name inside the `dplyr` functions. Using `DF[which(DF$ID == ID & DF$year < year),3]` instead of just using column names will break groupings and often cause errors. – Gregor Thomas Jun 21 '21 at 19:41
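
For anyone landing here, a minimal sketch of what the comments above suggest, using the three rows from the question. The short column name `Exp` is an assumption standing in for "Experience gained that year", and `Cum_Exp_Prev` is a hypothetical extra column for the "previous years only" reading (`Year < year`) mentioned in the comments.

```r
library(dplyr)

# Example data from the question; Exp is an assumed short name for
# the "Experience gained that year" column.
DF <- data.frame(
  ID   = c(1, 1, 1),
  Year = c(2000, 2001, 2002),
  Exp  = c(1, 3, 1)
)

NewDF <- DF %>%
  arrange(ID, Year) %>%   # cumsum() depends on row order
  group_by(ID) %>%        # restart the running total for each ID
  mutate(
    Cum_Exp      = cumsum(Exp),        # includes the current year
    Cum_Exp_Prev = cumsum(Exp) - Exp   # previous years only
  ) %>%
  ungroup()

print(NewDF)
```

Note that the columns are referred to by bare name inside `mutate()`; nothing is indexed through `DF$...`, so the grouping by `ID` is respected.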
