Here is a data.table solution:
library(data.table)

# make the data
df <- data.table(
  worker_id = c(111, 111, 222, 222),
  difference = c(5, 3, 5, 2)
)

# calculate mean difference
df_new <- df[
  ,
  # make a new column called "difference_mean_by_worker_id" to be the mean of
  # "difference"
  "difference_mean_by_worker_id" := mean(x = difference),
  # grouped by worker_id
  by = "worker_id"
]
df_new
   worker_id difference difference_mean_by_worker_id
1:       111          5                          4.0
2:       111          3                          4.0
3:       222          5                          3.5
4:       222          2                          3.5
This script calculates the mean of difference within each worker_id group and stores it in a new column, repeated on every row of that group. Hope this helps!
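One thing to be aware of: := adds the column by reference, so df itself gets the new column and df_new is just another name for the same table. If you want to keep the original untouched, a minimal sketch would be to work on a copy first (df_copy is a name I made up for illustration, and I am assuming the same example data as above):

```r
library(data.table)

# same example data as above
df <- data.table(
  worker_id = c(111, 111, 222, 222),
  difference = c(5, 3, 5, 2)
)

# copy() makes a deep copy, so the := below does not touch df
df_copy <- copy(df)
df_copy[, difference_mean_by_worker_id := mean(difference), by = worker_id]

ncol(df)       # 2, the original is unchanged
ncol(df_copy)  # 3, only the copy has the new column
```

If you don't need the original, you can skip the copy and the assignment entirely; calling df[, ... := ..., by = ...] on its own is enough.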