
I am trying to add another column to a dataframe where the new column is a function of the previous value in the new column and a current row value. I have tried to strip out irrelevant code and stick in easy numbers so that I might understand answers here. Given the following dataframe:

  x
1 1
2 2
3 3
4 4
5 5

The next column (y) will add 5 to x and also add the previous row's value for y. There's no previous value for y in the first row, so I define it as 0. So the first row value for y would be x+5+0 or 1+5+0 or 6. The second row would be x+5+y(from 1st row) or 2+5+6 or 13. The dataframe should look like this:

  x  y
1 1  6
2 2 13
3 3 21
4 4 30
5 5 40

I tried this with case_when() and lag() functions like this:

test_df <- data.frame(x = 1:5)
test_df %>% mutate(y = case_when(x == 1 ~ 6,
                                 x > 1 ~ x + 5 + lag(y)))

Error: Problem with mutate() column y.
ℹ y = case_when(x == 1 ~ 6, x > 1 ~ x + 5 + lag(y)).
x object 'y' not found
Run rlang::last_error() to see where the error occurred.

I had thought y was defined when the first row was calculated. Is there a better way to do this? Thanks!

CosmicSpittle

2 Answers

You don't need lag here at all. Just a cumsum should suffice.

library(dplyr)

test_df %>% mutate(y = cumsum(x + 5))

#>   x  y
#> 1 1  6
#> 2 2 13
#> 3 3 21
#> 4 4 30
#> 5 5 40
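
To see why cumsum is enough here: the rule y[i] = x[i] + 5 + y[i-1], with a starting value of 0, unrolls into a running total of x + 5. A minimal base R sketch of the explicit loop, for comparison only (the names y_loop and prev are made up for illustration):

```r
test_df <- data.frame(x = 1:5)

# Explicit version of the recurrence: y[i] = x[i] + 5 + y[i - 1], starting from 0
y_loop <- numeric(nrow(test_df))
prev <- 0
for (i in seq_len(nrow(test_df))) {
  y_loop[i] <- test_df$x[i] + 5 + prev
  prev <- y_loop[i]
}

# The loop and the vectorised cumsum give the same values
all.equal(y_loop, cumsum(test_df$x + 5))
#> [1] TRUE
```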

Data

test_df <- data.frame(x = 1:5)
Allan Cameron

We can also use purrr::accumulate here:

library(dplyr)
library(purrr)

test_df %>% mutate(y = accumulate(x + 5, ~ .x + .y))

  x  y
1 1  6
2 2 13
3 3 21
4 4 30
5 5 40

We can also pass accumulate an ordinary R function instead of the formula shorthand:

test_df %>% mutate(y = accumulate(x + 5, function(x, y) {x + y}))
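
For a rule that is not a plain running total, it can help to name the two arguments. In accumulate, the function's first argument is the value carried over from the previous step and the second is the current element; .init supplies the starting value. A sketch, assuming the same test_df as in the question (next_y and result are hypothetical names):

```r
library(dplyr)
library(purrr)

test_df <- data.frame(x = 1:5)

# Hypothetical helper: prev is the previous row's y, cur is the current row's x
next_y <- function(prev, cur) cur + 5 + prev

result <- test_df %>%
  mutate(y = accumulate(x, next_y, .init = 0)[-1])  # [-1] drops the initial 0

result
```

Replacing the body of next_y with another expression in prev and cur should give a different recurrence without changing the mutate call.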
GuedesBF
  • This is the path I tried to go down, but I don't fully understand the R documentation for accumulate. Would you mind explaining the arguments to accumulate here, specifically the ~.x + .y (tilde and period)? Typically I have a function rather than just adding 5 to x. – CosmicSpittle Oct 30 '21 at 02:00
  • Never mind, there is an explanation here: https://stackoverflow.com/questions/62488162/use-of-tilde-and-period-in-r Thanks for your help! – CosmicSpittle Oct 30 '21 at 02:22
  • I added an equivalent version with basic R syntax, so it may be clearer for beginners. – GuedesBF Oct 30 '21 at 02:57