
I am trying to create a function that calculates, for each row of a data.frame, the mean of that row's value and the next n - 1 values, using tidyverse syntax. The way I see it, this means using lead dynamically with an n value. Something like df %>% mutate(mean_5 = mean_last(value, 5)), where each row's result is the mean of its own value and the following 4 values.

Manually, for the n = 3 case, it would be something like:

df %>% mutate(av3 = (y + lead(y, 1) + lead(y, 2))/3)

I've tried using mean(lead(value, n = 1:3)), but that doesn't work.

Example:

library(dplyr)
df <- data.frame(x = 1:6, y = c(10, 4, 8, 6, 5, 1))
df %>% mutate(av3 = (y + lead(y, 1) + lead(y, 2))/3)

This returns df with a new av3 column with values c(7.3, 6.0, 6.3, 4.0, NA, NA) (rounded to one decimal place).

I want to get that output automatically, without having to type the lead function n - 1 times. Just something like new_mean = mean(lead(value, 1:10)).

And lastly, it would be super nice if it works with the group_by function!
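To make the goal concrete, here is a minimal sketch of the kind of helper I have in mind. The name `mean_next` is hypothetical (it is not part of dplyr), and this is only one possible way to write it. It builds the shifted copies with `lead`, which takes a single offset per call (which is why `lead(value, 1:3)` fails), and then averages them element-wise.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): mean of each value and the
# next n - 1 values. Windows that run past the end of x give NA.
mean_next <- function(x, n) {
  # x itself plus lead(x, 1), ..., lead(x, n - 1), one scalar offset per call
  shifted <- c(list(x), lapply(seq_len(n - 1), function(k) dplyr::lead(x, k)))
  # Element-wise sum of the shifted copies, divided by the window size
  Reduce(`+`, shifted) / n
}

df <- data.frame(x = 1:6, y = c(10, 4, 8, 6, 5, 1))
res <- df %>% mutate(av3 = mean_next(y, 3))
print(res)
```

On the example data this should reproduce the manual `lead` version, including the two trailing NA values.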

Bernardo
  • 1
    There you go @akrun :) – Bernardo Mar 28 '19 at 15:57
  • 3
    `lead` and `lag` weren't written with this in mind---I think any solution you come up with using them will be pretty hacky. Why not just use standard rolling functions like those mentioned in the [r-faq for moving averages](https://stackoverflow.com/q/743812/903061)? – Gregor Thomas Mar 28 '19 at 15:58
  • 1
    Personally, I'd recommend the `data.table` implementation: `df %>% mutate(av3 = data.table::frollmean(y, n = 3, align = "left"))`. – Gregor Thomas Mar 28 '19 at 16:21
  • 1
    *"And last, it would be super nice if it lets you use the group_by function!"* Anything you put inside `mutate` will use the `group_by` function. Mutate handles that, the functions inside `mutate` don't have to worry about it. – Gregor Thomas Mar 28 '19 at 16:22
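The rolling-function suggestion from the comments can be sketched with grouping as follows. This assumes the data.table package is installed, and the grouping column `g` and its values are made up for illustration. Because `mutate` evaluates the expression once per group, the window never crosses a group boundary.

```r
library(dplyr)

# Illustrative data: g is a hypothetical grouping column
df <- data.frame(
  g = rep(c("a", "b"), each = 4),
  y = c(10, 4, 8, 6, 5, 1, 3, 7)
)

res <- df %>%
  group_by(g) %>%
  # align = "left" averages each value with the following n - 1 values
  mutate(av3 = data.table::frollmean(y, n = 3, align = "left")) %>%
  ungroup()

print(res)
```

The last two rows of each group come out as NA, since fewer than three values remain in the window there.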

0 Answers