Add a column of group averages in R

Question

In this (made up) data frame df, I want to show the average age grouped by ID.

Output right now:

Desired output:

   ID age  meanage
    1  20   22.5
    1  25   22.5
   20  40   35.0
   20  35   35.0
   20  30   35.0
  356  27   27.0
 4001  22   48.0
 4001  49   48.0
 4001  46   48.0
 4001  75   48.0
55555  50   52.0
55555  54   52.0

One solution that works is

for (i in min(df$ID):max(df$ID))
{
  df$meanage[df$ID == i] <- mean(df$age[df$ID == i])
}

However, this is really slow, especially for a very large data set. Is there a more efficient way to do this?

akrun · Accepted Answer · 2015-02-02T12:47:43.333

2

Here is a data.table method to find the average "age" by "ID"

library(data.table)
setDT(df)[,meanage:=mean(age)  , by = ID][]

Or a base R approach

df$meanage <- with(df, ave(age, ID))

Or using dplyr (contributed by @David Arenburg)

library(dplyr)
df %>% 
   group_by(ID) %>%
   mutate(meanage = mean(age))

edited Feb 02 '15 at 12:47

answered Feb 02 '15 at 12:42

akrun

874,273
37
540
662

score 0 · Answer 2 · answered Feb 02 '15 at 17:51

You could also use plyr

library(plyr)
x <- ddply(df,.(ID),summarise,ave=mean(age))
#same as (I think):
x <- ddply(df,~ID,summarise,meanage=mean(age))

If you want the average duplicated (as in your example) you can merge the new result onto it (there may be a way to avoid that step).

Add a column of group averages in R

2 Answers2