2

Objective: calculate the mean of each pair of values by group in a dataframe

Example data:

mtcars[1:10, 1:2] %>%
   arrange(cyl)

 mpg cyl
22.8   4
24.4   4
22.8   4
21.0   6
21.0   6
21.4   6
18.1   6
19.2   6
18.7   8
14.3   8

Expected Output:

mpg   cyl   avg
22.8   4     -
24.4   4     23.6
22.8   4     23.6
21.0   6     -
21.0   6     21.0
21.4   6     21.2
18.1   6     19.75
19.2   6     18.65
18.7   8     -
14.3   8     16.5 

Thought I could use dplyr::lag or dplyr::slice to do this, but haven't got it working. Not married to dplyr, any solution would be appreciated...

Attempt:

group_by(mtcars, gear) %>%
   summarise(pairmean=mean(c(mpg, lag(mpg))))
Minnow
  • 1,733
  • 2
  • 26
  • 52
  • 1
    I never thought of this as a moving average problem, though I do find the answers to your question useful. Didn't find your question based on apparently lack of nomenclature similarity. Fwiw, for my purposes the order of values is not relevant (despite my use of arrange). – Minnow Mar 30 '17 at 18:35
  • @akrun It's right that OP didn't use the word moving average; but from the description one can only conclude that what OP wants is a moving average – Jaap Mar 30 '17 at 18:50

3 Answers3

2

Here is a base R method with ave.

# get data
temp <- mtcars[order(mtcars$cyl), c("mpg", "cyl")]

# calculate value pairwise means
temp$avg <- ave(temp$mpg, temp$cyl, FUN=function(i) (i + c(NA, head(i, -1))) / 2)

Here, mpg is added to a lagged value and divided by 2. The lagged value is constructed using head with the final element removed (-1) and an NA concatenated to the initial position.

This returns

head(temp)
                mpg cyl   avg
Datsun 710     22.8   4    NA
Merc 240D      24.4   4 23.60
Merc 230       22.8   4 23.60
Fiat 128       32.4   4 27.60
Honda Civic    30.4   4 31.40
Toyota Corolla 33.9   4 32.15
lmo
  • 37,904
  • 9
  • 56
  • 69
1

We need a rolling mean (rollmean from zoo) grouped by 'cyl'

library(dplyr)
library(zoo)
mtcars[1:10, 1:2] %>%
        arrange(cyl) %>% 
        group_by(cyl) %>%
        mutate(avg = rollmean(mpg, 2, na.pad = TRUE, align = "right"))
akrun
  • 874,273
  • 37
  • 540
  • 662
1

With dplyr...

mtcars %>% group_by(cyl) %>% mutate(pairmean=(mpg+lag(mpg))/2)
Andrew Gustar
  • 17,295
  • 1
  • 22
  • 32