Calculate a new column with mean by one argument in the data frame

Question

I want to add a new column (difference_mean_by_worker_id) to my existing data frame that holds the mean of difference, grouped by worker_id. Every row with the same worker_id should therefore show the same mean in the new column. Like this:

[image: desired output table]

Thanks, Tim

That's a very basic task. You should find plenty of input on the web. I would recommend the data.table package, or the built-in tapply function. — Andre Elrico, Sep 27 '17 at 08:43
Just `ave(df$difference,df$worker_id)`. Don't post images; rather copy/paste your dataset (or part of it) so everybody can use it. — nicola, Sep 27 '17 at 08:44

score 0 · Answer 1 · answered Sep 27 '17 at 09:00

Here is a data.table solution:

library(data.table)

# make the data
df = data.table(
  worker_id = c(111, 111, 222, 222),
  difference = c(5, 3, 5, 2)
)

# calculate mean difference
df_new = df[
  ,
  # make a new column called "difference_mean_by_worker_id" to be the mean of
  # "difference"
  "difference_mean_by_worker_id" := mean(x = difference),
  # grouped by worker_id
  by = "worker_id"
]
# note: := adds the column by reference, so df itself now has it as well

df_new

   worker_id difference difference_mean_by_worker_id
1:       111          5                          4.0
2:       111          3                          4.0
3:       222          5                          3.5
4:       222          2                          3.5

This script calculates the mean of difference within each group of rows sharing the same worker_id and writes that mean to every row of the group. Hope this helps!
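
For comparison, the base R functions mentioned in the comments (ave and tapply) can do the same without any package. This is only a minimal sketch on the same toy data; the names workers, group_means and mean_via_tapply are made up for the illustration and are not from the question.

```r
# Same toy data as in the answer, as a plain data.frame (no packages needed)
workers <- data.frame(
  worker_id = c(111, 111, 222, 222),
  difference = c(5, 3, 5, 2)
)

# ave() returns one value per row: the group mean repeated within each group
# (mean is the default FUN of ave())
workers$difference_mean_by_worker_id <- ave(workers$difference, workers$worker_id)

# tapply() returns one value per group instead, named by worker_id,
# so it has to be mapped back onto the rows by name
group_means <- tapply(workers$difference, workers$worker_id, mean)
workers$mean_via_tapply <- as.numeric(group_means[as.character(workers$worker_id)])

print(workers)
```

ave is usually the shorter choice when the result should have one value per row, while tapply fits better when a per-group summary is wanted on its own.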

Thanks for your comments and your help :) — Matthias, Oct 02 '17 at 07:53
